In-Depth
Text Processing, Type Definition, I/O and Visualization in F#
What you can do with most programming languages can be accomplished in F#'s functional programming paradigm. Here's how to handle some simple operations, which might look familiar to you already.
- By Arnaldo Pérez Castaño
- 04/20/2016
In prior articles I described some interesting and useful features that the functional programming paradigm, and specifically F#, can offer when developing computer applications. This time I'd like to go into more detail with some practical examples. Specifically, let's look at text processing, type definition, input/output and visualization.
Text Processing
Text in computer science is represented as strings. Put simply, a string is a finite sequence of characters belonging to an alphabet. All text processing tools and operations rely on strings as their basic processing unit, and one of these tools is the regular expression. A regular expression is a pattern that implicitly describes a set of strings belonging to a regular language; thus, regular expressions represent regular languages. This powerful string processing tool is part of the .NET platform and is therefore available from F#.
To use regular expressions in F#, one first needs to open (import) the System.Text.RegularExpressions namespace. Here's an example that creates a regular expression recognizing any word written in lowercase letters:
open System
open System.Text.RegularExpressions

[<EntryPoint>]
let main argv =
    let reg = new Regex("[a-z]+")
    printf "%A" (reg.IsMatch("jordan"))
    // Read user input
    let input = Console.ReadLine()
    0 // return an integer exit code
The IsMatch method returns true if any substring of the string supplied as an argument matches the regular expression pattern ([a-z]+). Hence, after executing that code the output will be true. If the string passed as an argument changes from "jordan" to "23", no match can be found and the return value is false.
Text transcription is an operation that can be easily achieved using regular expressions. This code illustrates a transcrip function that takes the arguments str (the string to be transcribed) and pattern (the pattern to be matched and replaced); every match is replaced with the fixed string "23":
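Because IsMatch succeeds on any matching substring, a string that mixes lowercase letters with other characters also matches. The following is a minimal sketch of that behavior; the names anyLower and onlyLower and the anchored pattern are illustrative assumptions, not part of the original example. Anchoring the pattern with ^ and $ is one way to require that the whole string consist of lowercase letters.

```fsharp
open System.Text.RegularExpressions

// Unanchored: succeeds if any substring consists of lowercase letters.
let anyLower = Regex("[a-z]+")
// Anchored (illustrative variant): the whole string must be lowercase letters.
let onlyLower = Regex("^[a-z]+$")

printfn "%b" (anyLower.IsMatch("jordan23"))   // true
printfn "%b" (onlyLower.IsMatch("jordan23"))  // false
printfn "%b" (onlyLower.IsMatch("jordan"))    // true
```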
let transcrip str pattern = (new Regex(pattern)).Replace(str, "23")
printf "%A" (transcrip "Hi Mr. Jordan!" "Jordan")
After executing this code the result is shown in Figure 1.
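Since the replacement text is hard-coded in transcrip, a natural variation is to pass it as an argument. The sketch below assumes a hypothetical function named transcribe and uses the static Regex.Replace method; it puts the input string last so the function can be used with the pipe operator.

```fsharp
open System.Text.RegularExpressions

// Hypothetical variant of transcrip: the replacement text is an argument too.
let transcribe (replacement: string) (pattern: string) (str: string) =
    Regex.Replace(str, pattern, replacement)

let result = "Hi Mr. Jordan!" |> transcribe "23" "Jordan"
printfn "%s" result  // Hi Mr. 23!
```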
Regular expressions can also be used to split text according to a given criterion. One may need to split text at the occurrence of numbers, specific letters, etc. In such cases a regular expression and the Split method, which returns an array of strings, can serve as a solution.
In this code, the regular expression splitreg matches a single space or any digit:
let splitreg = new Regex(" |[0-9]")
let splitted = splitreg.Split("1 2 3 Mr.Jazz!")
printfn "%A" splitted
The Split method divides the input string every time it finds a space or a digit. Since there are three digits and three spaces, from left to right, before reaching the substring "Mr.Jazz!", and they are all consecutive, nothing lies between one separator and the next, so the resulting array holds empty strings followed by "Mr.Jazz!", as you see in Figure 2.
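To make the effect of consecutive separators concrete, here is a small sketch that repeats the split and then discards the empty entries. The filtering step and the names pieces and nonEmpty are illustrative additions, not part of the original example.

```fsharp
open System.Text.RegularExpressions

let splitreg = Regex(" |[0-9]")
let pieces = splitreg.Split("1 2 3 Mr.Jazz!")
// Six consecutive separators yield six empty strings before the final piece.
printfn "%d" pieces.Length  // 7

// Illustrative clean-up: keep only the non-empty pieces.
let nonEmpty = pieces |> Array.filter (fun s -> s <> "")
printfn "%A" nonEmpty  // [|"Mr.Jazz!"|]
```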
After exploring the possibilities of text processing in F# let us dive into the opportunities for defining custom types.
Type Definition
F# lets you define custom types with the type keyword followed by the type name, an equal sign and the definition itself. A type definition can be used to define aliases (abbreviations) for existing types, including tuple types. This can be useful and meaningful, as shown in the next example, where a type person consisting of a 2-tuple of strings is defined and later used as a type annotation:
type person = string * string
let f (p : person) =
    printfn "%A" p
The syntax x1 * x2 * … * xn indicates an n-tuple type.
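A value of a tuple type is usually taken apart with a tuple pattern. The sketch below assumes, purely for illustration, that the two strings of person hold a first and a last name; the function fullName is hypothetical.

```fsharp
type person = string * string

// Hypothetical helper: destructure the pair with a tuple pattern.
// Assumption: the components are a first name and a last name.
let fullName ((first, last): person) = first + " " + last

let p : person = ("Michael", "Jordan")
printfn "%s" (fullName p)  // Michael Jordan
```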
Another type definable in F# is the record type. It is similar to a tuple because it groups several values into a single type, but differs in that a name must be provided for each field. The next example illustrates the creation of a record type named album and its later use:
type album = { year: int list; artist: string list }
let jazz = { year = [1998; 1999; 2000]; artist = ["Sting"; "Duke Ellington"; "Chris Botti"] }
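Once a record value exists, its fields are read with dot notation, and a modified copy can be built with a copy-and-update expression. The following sketch uses the current record-expression syntax ({ field = value; ... }); the name extended and the added year are illustrative assumptions.

```fsharp
type album = { year: int list; artist: string list }

let jazz =
    { year = [1998; 1999; 2000]
      artist = ["Sting"; "Duke Ellington"; "Chris Botti"] }

// Fields are read with dot notation.
printfn "%A" jazz.artist

// Copy-and-update builds a new record; jazz itself is unchanged.
let extended = { jazz with year = 2001 :: jazz.year }
printfn "%A" extended.year  // [2001; 1998; 1999; 2000]
```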
Records play a role loosely similar to that of simple structs and classes in C#. Apart from these two, another kind of type that one may define in F# is the union or sum type.
Union types are a way of uniting values with different structures under one type. Their definition consists of a set of constructors (cases) separated by vertical bars. A constructor is composed of a name (beginning with a capital letter, unique within the type), optionally the of keyword and finally the types that form the constructor, separated by asterisks if there is more than one. This code illustrates how to define a binary tree union type:
type BinaryTree =
    | Node of int * BinaryTree * BinaryTree
    | Empty
Logically, a binary tree is either an Empty tree or a Node with an integer value and two children, which are also binary trees. Consider the binary tree in Figure 3.
A declaration for this structure follows:
let tree = Node(2, Node(3, Node(4,Empty,Empty), Empty), Node(5,Empty,Empty))
A function defining an in-order printing of node values in the tree is presented in the next lines of code:
let rec printInOrder tree =
    match tree with
    | Node (data, left, right) ->
        printInOrder left
        printfn "Node %d" data
        printInOrder right
    | Empty -> ()

printInOrder tree
The output after executing the previous function is shown in Figure 4.
An in-order traversal of a binary tree recursively visits the left subtree, then the current node and finally the right subtree; thus, for the previous tree the in-order sequence is 4, 3, 2, 5.
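The same pattern-matching structure can return the values instead of printing them, which makes the traversal order easy to check. This is a minimal sketch; the function name toList is hypothetical, and list concatenation with @ is chosen for brevity rather than efficiency.

```fsharp
type BinaryTree =
    | Node of int * BinaryTree * BinaryTree
    | Empty

// Hypothetical helper: collect the values in in-order sequence.
let rec toList tree =
    match tree with
    | Node (data, left, right) -> toList left @ [data] @ toList right
    | Empty -> []

let tree = Node(2, Node(3, Node(4, Empty, Empty), Empty), Node(5, Empty, Empty))
printfn "%A" (toList tree)  // [4; 3; 2; 5]
```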
Input / Output
I/O features are not closely tied to the functional programming paradigm. But F# is not purely functional; rather, it's a hybrid, and accordingly it incorporates features from the imperative programming paradigm. The I/O functions provided in the language illustrate this. The printfn function used in this and prior articles is probably the most common way of printing values of various types in F#; it is a simple function that provides simple formatting. Here are its most common format specifiers:
- %d print an int
- %s print a string
- %A print any type using the built-in printer
- %f print a float
- %g print a float in the more compact of fixed-point (%f) or scientific (%e) notation
- %O print any type using the ToString() method
In the next example (with the result shown in Figure 5), a string, an int and a float are printed using the printf function, which behaves like printfn but does not append a newline, so one is written explicitly:
printf "String: %s int: %d, float: %f\n" "NBA" 5 2.3
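The same format specifiers work with sprintf, which returns the formatted string instead of printing it. The sketch below is illustrative; the name line and the %.1f precision example are additions not found in the original text.

```fsharp
// sprintf accepts the same format specifiers but returns a string.
let line = sprintf "String: %s int: %d, float: %f" "NBA" 5 2.3
printfn "%s" line       // String: NBA int: 5, float: 2.300000

// A precision can be given between % and f.
printfn "%.1f" 2.3      // 2.3

// %A uses the built-in structural printer, here for a list.
printfn "%A" [1; 2; 3]  // [1; 2; 3]
```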
Reading from and writing to files can be achieved in F# thanks to the functionality offered by the .NET platform through the System.IO namespace, which must be opened first. The function shown in the following code opens the file indicated by the filename parameter, then reads it and yields each line until the end of the stream. All of this happens inside a sequence expression, so the final result is a lazily evaluated sequence of the lines in the file.
open System.IO

let readfile filename =
    seq {
        use stream = File.OpenRead filename
        use reader = new StreamReader(stream)
        while not reader.EndOfStream do
            yield reader.ReadLine()
    }

let lines = readfile @"F:\text.txt"
printfn "%A" lines
The use binding is equivalent to the let binding except that it disposes the bound object when the object goes out of scope. Inside a sequence expression that happens once the enumeration over the sequence has been completed, which closes the file at the end.
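To see the lazy reading and the disposal in action without depending on a particular drive, the following sketch writes a temporary file and reads it back. The temporary path and the file contents are assumptions made for illustration; readfile is repeated so the block is self-contained.

```fsharp
open System.IO

let readfile (filename: string) =
    seq {
        use stream = File.OpenRead filename
        use reader = new StreamReader(stream)
        while not reader.EndOfStream do
            yield reader.ReadLine()
    }

// Assumption for illustration: a temporary file stands in for F:\text.txt.
let path = Path.GetTempFileName()
File.WriteAllLines(path, [| "first"; "second" |])

// The file is opened only when the sequence is enumerated; the use
// bindings dispose the reader and stream when enumeration finishes.
let lines = readfile path |> Seq.toList
printfn "%A" lines  // ["first"; "second"]

File.Delete path    // the stream has been closed by this point
```

Converting the sequence to a list forces the whole file to be read immediately, which is convenient for small files; for large files, enumerating the sequence directly keeps only one line in memory at a time.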
Visualization
Providing visualization tools and libraries is not really the strong suit of functional programming languages. Then again, F# is not purely functional and is a .NET language, so it can work with Windows Forms and many graphics libraries like DirectX, OpenGL, etc. To start building a GUI application in F#, one first needs to open the namespaces System.Windows.Forms and System.Drawing, adding references to them in the project if necessary, and then create the form (the result is shown in Figure 6):
let form = new Form(BackColor = Color.LightBlue, Visible=true, Text="My F# form")
Application.Run(form)
Once the form is created, this code adds a picture box as a new control (see Figure 7). Because Application.Run blocks until the form is closed, these lines must run before that call:
let picturebox = new PictureBox(Width = 200, Height = 200, BackColor = Color.LightGray)
form.Controls.Add(picturebox)
Finally, a black ellipse is drawn inside the picture box using a solid brush, and an almost effortless painting in F# is completed (see Figure 8):
let brush = new SolidBrush(Color.Black)
picturebox.Paint.Add(fun e ->
    e.Graphics.FillEllipse(brush, 50, 50, 50, 50))
In this article I described the capabilities provided by F# to handle text processing, type definition, I/O operations and visualization through Windows Forms. Now it's time for the reader to keep exploring the fascinating, elegant world of functional programming.
About the Author
Arnaldo Pérez Castaño is a computer scientist based in Cuba. He is the author of a series of programming books -- JavaScript Facil, HTML y CSS Facil, and Python Facil (Marcombo S.A). His expertise includes Visual Basic, C#, .NET Framework, and artificial intelligence, and he offers his services as a freelancer through nubelo.com. Cinema and music are some of his passions. Contact him at [email protected].