开发者

Most idiomatic way to write batchesOf size seq in F#

开发者 https://www.devze.com 2023-04-06 19:40 出处:网络
I\'m trying to learn F# by rewriting some C# algorithms I have into idiomatic F#. One of the first functions I\'m trying to rewrite is a batchesOf where:

I'm trying to learn F# by rewriting some C# algorithms I have into idiomatic F#.

One of the first functions I'm trying to rewrite is a batchesOf where:

[1..17] |> batchesOf 5

Which would split the sequence into batches with a max of five in each, i.e:

[[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]

My first attempt at doing this is kind of ugly where I've resorted to using a mutable ref object after running into errors trying to use mutable type inside the closure. Using ref is particularly unpleasant since to dereference it you have to use the ! operator which when inside a condition expression can be counter intuitive to some devs who will read it as logical not. Another problem I ran into is where Seq.skip and Seq.take are not like their Linq aliases in that they will throw an error if size exceeds the size of the sequence.

let batchesOf size (sequence: _ seq) : _ list seq =
    seq {
        let s = ref sequence
        while not (!s |> Seq.isEmpty)  do
            yield !s |> Seq.truncat开发者_开发知识库e size |> List.ofSeq
            s := System.Linq.Enumerable.Skip(!s, size)
    }

Anyway what would be the most elegant/idiomatic way to rewrite this in F#? Keeping the original behaviour but preferably without the ref mutable variable.


Implementing this function using the seq<_> type idiomatically is difficult - the type is inherently mutable, so there is no simple nice functional way. Your version is quite inefficient, because it uses Skip repeatedly on the sequence. A better imperative option would be to use GetEnumerator and just iterate over elements using IEnumerator. You can find various imperative options in this snippet: http://fssnip.net/1o

If you're learning F#, then it is better to try writing the function using F# list type. This way, you can use idiomatic functional style. Then you can write batchesOf using pattern matching with recursion and accumulator argument like this:

let batchesOf size input = 
  // Inner function that does the actual work.
  // 'input' is the remaining part of the list, 'num' is the number of elements
  // in a current batch, which is stored in 'batch'. Finally, 'acc' is a list of
  // batches (in a reverse order)
  let rec loop input num batch acc =
    match input with
    | [] -> 
        // We've reached the end - add current batch to the list of all
        // batches if it is not empty and return batch (in the right order)
        if batch <> [] then (List.rev batch)::acc else acc
        |> List.rev
    | x::xs when num = size - 1 ->
        // We've reached the end of the batch - add the last element
        // and add batch to the list of batches.
        loop xs 0 [] ((List.rev (x::batch))::acc)
    | x::xs ->
        // Take one element from the input and add it to the current batch
        loop xs (num + 1) (x::batch) acc
  loop input 0 [] []

As a footnote, the imperative version can be made a bit nicer using computation expression for working with IEnumerator, but that's not standard and it is quite advanced trick (for example, see http://fssnip.net/37).


A friend asked me this a while back. Here's a recycled answer. This works and is pure:

let batchesOf n =
    Seq.mapi (fun i v -> i / n, v) >>
    Seq.groupBy fst >>
    Seq.map snd >>
    Seq.map (Seq.map snd)

Or an impure version:

let batchesOf n =
    let i = ref -1
    Seq.groupBy (fun _ -> i := !i + 1; !i / n) >> Seq.map snd

These produce a seq<seq<'a>>. If you really must have an 'a list list as in your sample then just add ... |> Seq.map (List.ofSeq) |> List.ofSeq as in:

> [1..17] |> batchesOf 5 |> Seq.map (List.ofSeq) |> List.ofSeq;;
val it : int list list = [[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]

Hope that helps!


This can be done without recursion if you want

[0..20] 
    |> Seq.mapi (fun i elem -> (i/size),elem) 
    |> Seq.groupBy (fun (a,_) -> a)
    |> Seq.map (fun (_,se) -> se |> Seq.map (snd));;
val it : seq<seq<int>> =
  seq
    [seq [0; 1; 2; 3; ...]; seq [5; 6; 7; 8; ...]; seq [10; 11; 12; 13; ...];
     seq [15; 16; 17; 18; ...]; ...]

Depending on how you think this may be easier to understand. Tomas' solution is probably more idiomatic F# though


Hurray, we can use List.chunkBySize, Seq.chunkBySize and Array.chunkBySize in F# 4, as mentioned by Brad Collins and Scott Wlaschin.


This isn't perhaps idiomatic but it works:

let batchesOf n l = 
    let _, _, temp', res' = List.fold (fun (i, n, temp, res) hd ->
                                           if i < n then
                                             (i + 1, n, hd :: temp, res)
                                           else
                                             (1, i, [hd], (List.rev temp) :: res)) 
                                       (0, n, [], []) l
    (List.rev temp') :: res' |> List.rev


Here's a simple implementation for sequences:

let chunks size (items:seq<_>) =
  use e = items.GetEnumerator()
  let rec loop i acc =
    seq {
      if i = size then 
        yield (List.rev acc)
        yield! loop 0 []
      elif e.MoveNext() then
        yield! loop (i+1) (e.Current::acc)
      else
        yield (List.rev acc)
    }
  if size = 0 then invalidArg "size" "must be greater than zero"
  if Seq.isEmpty items then Seq.empty else loop 0 []

let s = Seq.init 10 id
chunks 3 s 
//output: seq [[0; 1; 2]; [3; 4; 5]; [6; 7; 8]; [9]]


My method involves converting the list to an array and recursively chunking the array:

    let batchesOf (sz:int) lt = 
        let arr = List.toArray lt

        let rec bite curr =
            if (curr + sz - 1 ) >= arr.Length then
                [Array.toList arr.[ curr .. (arr.Length - 1)]]
            else
                let curr1 = curr + sz 
                (Array.toList (arr.[curr .. (curr + sz - 1)])) :: (bite curr1)   

        bite 0

    batchesOf 5 [1 .. 17]

    [[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]


I found this to be a quite terse solution:

let partition n (stream:seq<_>) = seq {
    let enum = stream.GetEnumerator()

    let rec collect n partition =
        if n = 1 || not (enum.MoveNext()) then
            partition
        else
            collect (n-1) (partition @ [enum.Current])

    while enum.MoveNext() do
        yield collect n [enum.Current]
}

It works on a sequence and produces a sequence. The output sequence consists of lists of n elements from the input sequence.


You can solve your task with analog of Clojure partition library function below:

let partition n step coll =
    let rec split ss =
        seq {
            yield(ss |> Seq.truncate n)
            if Seq.length(ss |> Seq.truncate (step+1)) > step then
                yield! split <| (ss |> Seq.skip step)
            }
    split coll

Being used as partition 5 5 it will provide you with sought batchesOf 5 functionality:

[1..17] |> partition 5 5;;
val it : seq<seq<int>> =
  seq
    [seq [1; 2; 3; 4; ...]; seq [6; 7; 8; 9; ...]; seq [11; 12; 13; 14; ...];
     seq [16; 17]]

As a premium by playing with n and step you can use it for slicing overlapping batches aka sliding windows, and even apply to infinite sequences, like below:

Seq.initInfinite(fun x -> x) |> partition 4 1;;
val it : seq<seq<int>> =
  seq
    [seq [0; 1; 2; 3]; seq [1; 2; 3; 4]; seq [2; 3; 4; 5]; seq [3; 4; 5; 6];
     ...]

Consider it as a prototype only as it does many redundant evaluations on the source sequence and not likely fit for production purposes.


This version passes all my tests I could think of including ones for lazy evaluation and single sequence evaluation:

let batchIn batchLength sequence =
    let padding = seq { for i in 1 .. batchLength -> None } 
    let wrapped = sequence |> Seq.map Some
    Seq.concat [wrapped; padding]
    |> Seq.windowed batchLength 
    |> Seq.mapi (fun i el -> (i, el)) 
    |> Seq.filter (fun t -> fst t % batchLength = 0) 
    |> Seq.map snd
    |> Seq.map (Seq.choose id)
    |> Seq.filter (fun el -> not (Seq.isEmpty el))

I am still quite new to F# so if I'm missing anything - please do correct me, it will be greatly appreciated.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号