類別理論 2.1:函數、外貌 (Category Theory 2.1: Functions, epimorphisms)

字幕列表影片播放

last time I defined a category for you, and explained that the most important
features of a category composability and identity and then
I started talking about the most important example of a category that we
use in programming, that's the category of types and functions, right. so in
programming we have types and functions.
But the model for types and functions is really sets, sets and
functions between sets, right. So I mean every time you design a language you have
to provide semantics for the language right?
What does it mean, certain things in language and a lot of languages are
defined using so-called operational semantics i mean the ones who actually
have semantics, right.
There are very few of them, none of the major ones. But they sort of like, have part of its semantics
defined right? And there are two ways of defining semantics one is by telling
how this thing executes right, so it's called operational semantics, because you're
defining operations of of the language. You know how one expression or
statement can be transformed into another simpler one and so on. And
the other is called denotational semantics where you actually map it into
some other area that you understand and in particular the most interesting
way of mapping this is into mathematics, right so you build a mathematical model you say "this statement in the
"language or this construct in the language corresponds some mathematical
"thing." and this mathematical thing for instance for types is a set of values. For
functions, it's a function between sets. Ok, that's why it's so important and that's why I
will just talk about type sometimes and sometimes I will say Set, sometimes i
will say type because I have this semantics in mind
Ok? So what is important is that a function in programming, especially when
you are considering you know imperative programming it's not exactly the same as
the function in mathematics right so we have to like be more specific by
"function between sets" we really understand something that in programming
would be called pure function right? And it will be a pure function and a total
function because a mathematical function is defined for every argument
not just for some arguments right? This is like the major source of problems in
programming that we have these partial functions, functions are defined for some
arguments and for other arguments they explode, right? I mean if they throw
exception that and then we catch this exception it's fine right but quite often
they don't they just
destroy the computer, explode and stuff. So, a total function is defined for all its
arguments right and these arguments are taken from some type so like if it's a
function on integers it has to be defined for all integers right? And
it's a pure function and how can you tell if a function is pure? So I have this test
for a purity of a function: a function is pure if you can memoise it which means
ah you know you can turn it into a lookup table right? Because for every
value of an argument it should return one particular value right? And you can
just remember this value, okay, you call this function once it evaluates it and then
you can remember the next time you do a lookup into a table right? So if you have
like you know, types that are finite that only have a finite number of
elements like boolean right then it's really easy to tabulate them,
functions on characters if it's just ASCII characters, they are really easy to tabulate
right, all these functions like `isAlpha`, you know, or `isPrintable`, they
are all actually tabulated, so they are very fast.
Now functions on, let's say integers or strings, they usually cannot be
tabulated right but that's only a problem because of resources right? If we
had infinite resources we could tabulate it. So ask yourself the question "can I
actually memoise this function?" and for instance a function like `getChar` cannot
be memoised, right? You call it once and then you say "Oh, it returned 'h', good.
"I'll just remember it and from now on whenever somebody calls function getChar I will return 'h'
"and that's it". No, you can't do this, right, so this is not a pure function. Now of course
you might say "okay, but how can you program using only pure functions" right?
We need things like side effects
So all this stuff can be described on top of pure functions, but
what we really want is to get to the bottom of the abstraction like what is the
lowest level (or the highest level of abstraction depending on how you look at it) right?
so what is the atom, what are the building blocks
what are the simplest building blocks from which we can build more
complex stuff right? Using again the same idea of composability so first we want
to decompose the problem, get to the little blocks at the bottom, and then
recompose stuff from there.
So when we are decomposing this idea of using procedures and data types and so
on, we eventually get to the bottom and that's pure functions. So on top of your
functions we can be building more complex stuff, including things like I/O
So now that we have this category of types and functions, right and I talked a little
bit about this, now let me expand on functions, because functions are very
interesting, there's much more to know about functions than we usually consider
right? So they are really interesting beasts. Even these pure
functions right? Total pure functions. We have to look at them from a
slightly different perspective- from a perspective of how we can use them
as morphisms in our category right? The category Set
which contains sets and functions between sets.
So functions are defined in mathematics as special kinds of relations
ok what's a relation? Relation is, you know, take two sets they have elements
right. Now we're talking about set as in set theory, not as a category. In a category
won't be able to look at elements, but in set theory we can look at elements right? A relation is
just a subset of pairs of elements
So it's just a pairing of things, we say this element in a relation
with this element, ok? And again as I explained we are using our brains to
understand these things and here we are like talking about relationships so
this is, you know our relationship frame, talking about social
interactions in mathematics.
Yes you have a question > Does the relation have to be total?
In general? No.
We're just talking about general relations in mathematics right?
So we say ok, this person is in a relation with this person and they are, lets say friends.
So this guy's a friend of this guy maybe this guy's not a friend of this
guy so the relation doesn't have to be symmetric you know? So, let me define a
relation okay?
First of all what is a set of pairs? A set of pairs in mathematics is called a
Cartesian product, right? So we have two sets, S1 and S2, and we can do
a Cartesian product of it so all elements from one set are paired with
all possible evidence from the other set so the set of all pairs forms the
Cartesian product. Now we take a subset of this Cartesian product, we just pick
some pairs, randomly or not, you know, if they are meaningful. We just
pick a subset and any subset of this Cartesian product is a relation by
definition that's a relation ok? So there are no other requirements, just a subset of
the Cartesian product, that's a relation.
So in this sense relation does not have a directionality, right, whereas functions
functions have direction functions are these arrows so this is one of the most
important things to understand, that functions have some kind of
directionality. So what kind of condition do we have to impose on the
relation for it to become a function? And in a relation we can have this element
you know, pick an element in this set, it can be in relation with many elements in this
other set right? And vice versa, an element from set can be in relation–
(well okay I'm drawing, putting these arrows in this direction). So many elements from set S1 can be
in relation with a single element in set 2, or one element from set 1
can be in relation with many elements from set 2, both are ok for every
relation. One of them is not okay for functions.
(I need an eraser)
I'm not gonna use my hand!
Which one is bad for a function? The top one right? One element, one argument of a
function cannot be mapped into a bunch of things. It has to be mapped to one thing
ok? It's still ok for multiple arguments to be mapped into the same value for a
function, but not the other way. So that's where the directionality comes from. Last,
we want to be considering total functions for our category right? So this
whole set has to be mapped. All elements of this set must be mapped into something in this other set
Ok? However it's not true that all the elements from this set have to
counter, you know, have to come from this set. So this could be a subset that's being mapped
Ok? Now these things have names so let me name these things for future reference
so that we can talk about them easily. So if you have a function f,
(so this is like a graph of the function f) this is called the domain
of a function, right? The domain of the function is just this set from
which we have started, the whole set.
that's its domain. This set is called the codomain.
And this subset that was obtained by mapping the whole domain is usually
called the image of the function.
Ok?
So functions are not symmetric in this way, you
see? This is the whole set, this could be a subset. Multiple elements can be
merged into one element but not vice versa.
So there is this directionality and this directionality is very important
That's a very important intuition about functions, and this is
not only an intuition about functions because later we'll see that in
category theory also we have all kinds of other mappings, we have mappings between
categories called functors, we have mappings between functors which are called natural transformations
and so on. All these mappings have the same kind of
directionality, ok? And the simplest way to understand this directionality
is asking "is the function invertible?". Like, if I go from here to here is it
possible to go back?
And usually it's not. So if there is a function f from some A to B, you
know, there isn't always a function that goes the other way around, that is the
inverse of this function. How do we define an inverse of a function?
Well, if you have a function that goes from A to B, and I will be
using this notation that is just taken from Haskell,
So f is a function, double colon means "this is the type of function",
the type of function, it goes from a to b. Alright so this is set A, set B, type a type b.
Ok? So this function is invertible if there is another function that goes in
the other direction, from b to a. That's the inverse of this one, what does
it mean? It means that g after f equals id.
Ok? So if you first go f to B and then go back from B to A using g, you should end up
in the same, exactly the same point. So if you have an x here, x maps it to
some y here and then you apply g to y, you should get back to x ok? That
means the inverse, ok, but we can compose them two ways right? So g after f is one,
but there's also a composition f after g and this one also has to be an identity.
So we go g first and then f, and you should end up at the same point ok? So this makes it
symmetric. So a function that's invertible is automatically symmetric ok?
And the function that is invertible is called isomorphism.
So this is the definition of isomorphism and notice an interesting thing:
this is in the language of categories.
Maybe with a little bit of additional notation, if f goes from A to B and g
goes from B to A then this is an identity in which object? In A. Right? So
Now let's forget about this picture and say A, B, this is f, this is g.
Ok? So g after f, I'll get identity on a. and then f after g, is the same as
identity on b. Okay, and I have written this in categorical language, I'm not talking
anymore about elements, right? Here I was talking about elements of sets, I
don't need to. Okay? Now I can go back to categories and say "this is a definition
of isomorphism in any category". Not only in a category of sets, in any category because
it's expressed in terms of composition and identity, nothing else, right? So f is
an isomorphism if there exists a g with these properties
Ok?
So isomorphisms are great because isomorphisms, like, identify two
sets they say "this set really looks like this set", that there's like one to one
mapping between elements of these sets right? So that's, you know, if this is a two
element set and this is a two element set, you know I have a mapping between
these elements. That's great, right? For my purposes they are sort of like
identical right? It's not really identity, it's isomorphism, but I have an
identification between these two sets. So for finite sets you know an isomorphism
is just an identification of elements, this element corresponds to this, this
corresponds to this and so on,
one-to-one correspondence. For infinite sets, of course, this is a little bit more
complicated. For instance you can have a one-to-one correspondence between
natural numbers and even numbers.
y = 2x
even numbers. Or odd numbers
Right? These are functions that are invertible. Is that true?
No, but if this is a set of even numbers, right, so let's say this is a
set of even numbers (not the set of all numbers, just a set of even numbers) and this
is the set of all numbers, and then I have one-to-one correspondence between
even numbers and all numbers, right?
That's because they are infinite and for infinite sets you can do tricks like
these, but if this is a natural number and this is, say, you know
real numbers, right? Real numbers.
Now you know that there is no correspondence like this right? Do you know that?
There's a good argument, that [...]
But that's just a side remark, right. So just to make you aware of the fact that
isomorphisms kinda tell you that these two things are for some intents and purposes identical
But most functions are not isomorphisms, right?
Most in a sense of, I dunno, most functions that we deal with are not
isomorphisms and there are two reasons for a function not to be an isomorphism,
Ok? So you have to have good reasons not to be an isomorphism. One reason is because
the function can collapse elements of a set, right? So it takes multiple
multiple elements of this set are mapped to a single element of this set, so let me be
more precise here, just focus on one thing
so you have some x_1 and you have some x_2, and function f maps both of these elements to the same y.
Ok? For instance, let's see, a function `isEven`, which returns a boolean, right?
So `isEven` maps every even number into this one element of the boolean called true.
All even numbers are mapped to true, all odd numbers are mapped false so lots,
lots of numbers are mapped into one value. So we have this kind of gluing
going so that's one reason for this function not to be invertible. There's
another reason for it not to be invertible and that it doesn't–
that its image does not fill the whole codomain, right? So you have a function in which
No, so this is the codomain but there are some points here, this is the image
of function f, there are some points outside of the image that
just don't correspond to any x's here, right?
So if you would want to invert this function you would have to to create a
function g and what's the value of g on this point right? maybe you can invert
for these points you can say okay this x is mapped to this y
ok maybe I can invert it, if I don't have this gluing right? Then maybe I can invert it,
but I don't know what to do with this guy, what's should g be on this? No idea.
Yes? > Isn't it possible to have an inverse relation on this [...] or do we not care about those?
Yes, yes, yes and in fact, you know at some point people have this,
instead of inverting functions what you could say is that the counter image of
this y, right, it would–
Ok, in this case, right, it will be some bunch of points here, right? And this bunch of
points that are mapped to the same point.
It's called a fibre ok? And fibration is a very interesting topic in category theory.
Because then you can build a set of fibers here and you can say "Okay, on
a set of fibres this thing is invertible". I'm getting really really ahead now, ok
So there are these two reasons for a non-invertibility, so I like to think of this
like a function, if you think of a function as something directional
something, a process that takes place in time, a function that's not invertible
is something that increases entropy, right? It boils an egg and you
cannot un-boil an egg, right? So these kind of things make, you know, make this thing
non-invertible. And it's very important because these two things, these two
phenomena here that make it non-invertible, they correspond to a very
interesting thought process that we use to solve problems.
This corresponds to abstraction, this means, you know "I really don't care which
"of these points I came from". I'm only interested in one property there, so I'm
abstracting, I'm throwing away some information about whether it was x_1 or
x_2 and I'm left with just one piece of information. For instance I'm throwing away
information about what was the exact number that I mapped, i'm just extracting the
information whether it was even or odd.
Ok? So I'm sifting information, right? And I cannot invert it, I cannot say "Oh, it
"was an even number so it must be two, must be zero." No, right I don't know how
I came to this conclusion, but I have abstracted a bunch of information and now
I have just this one piece of information that I care about.
Ok? So this is the process of abstraction. This on the other hand, is the process of
modelling, ok? I'm modeling this set inside this bigger set, or maybe it's not
bigger. right, but since it has this,
this environment, right. So I'm saying "okay I'm recognizing the pattern that's
"defined by this set inside this second set."
So I'm mapping it, so I'm sort of, like injecting this right here, right? This is some
stick figure or something, and I see the same stick figure here.
Ok? So you know this is maybe a human sitting in a cave, and this is the
shadow on the wall of the cave. I'm embedding it in here.
And I'm abstracting maybe also, right? Because this is a
three-dimensional guy, this is a two-dimensional shadow and doesn't
have, you know, so I'm doing both abstraction and modelling. And then
somebody comes and recognizes this
"Hey you just drew a person!"
Oh! The first human being drew the person on the wall and they are excited. So they
just discovered category theory.
Ok so mathematicians of course have names for these things, right? But they
their names are like the inverse what I described so, like, they say
"If a function does not collapse things then its called injective, or injection"
An injective function does not collapse
things ok? So if you have two different x's, x_1 and x_2, and you map them with a
function then you will always get y_1 and y_2, they will be different so if x_1 is
different from x_2 and–
ok let me write fx_1 and fx_2. So if these guys are different, then these guys
are different too. Which means it's not collapsing anything, right?
It's just injecting the thing as is, right? The whole thing with all the
points in here get transported to this other set.
No shrinking, no forgetting, no abstraction so thats an injective function.
It just injects.
And this thing also they have a name for the opposite of this, namely, if the
function covers the whole codomain, right, the image is equal to the codomain, then
it's called surjective.
And that's from I don't know French or Latin "sur-" means "on" right so it's "onto".
It's a function onto the other. So this one is a counterexample for a
surjection right? That's not a surjection because it has this terra
incognita here. A surjection doesn't have this, a surjection covers the whole thing right?
And in particular in set theory, you know, if you have a function that's both
injective and surjective at the same time it's actually an isomorphism, you
can invert ok? So that's a good thing to know. But now you might say "Okay,
"so we know something about set theory, that there are injective functions and
"surjective functions". Now if we define this in terms of elements, right I said
elements here, right?
And surjected it means that for all ys in the second set, y
equals f_x right? There exists an x. For all y's there exists an x, that y is equal to f_x.
That's in terms of elements of a set. Like, there is no thing outside.
Every element that I pick in this set has a counter-image, ok? So I have
defined something in terms of elements. And now if I go to category theory, you know, the Set category,
how do I talk about injections and surjections, right? If I cannot look at the elements.
Turns out I can, but it's a tricky business because I have to express this
stuff now only in terms of morphisms, right, of other morphism so I have to,
like, turns out I have to talk about all the other objects in the category, all of
them, in order to define this one property.
And this is very very characteristic to category theory, if you want to
define a property of an object or an arrow, you define it with respect to
essentially everything else, ok? So this is a very holistic approach, right? You
cannot just focus on one little thing because no matter how good your
microscope is you cannot look inside this little point, right? Where a is, right?
And here's b. Because in category theory this is a point, here it's a big thing I can look at it
using a microscope, so if my microscopes don't work, maybe my
telescopes will work. I'll be looking at the whole universe and I will say
"this little guy is in a relationship with the universe" and the
relationship of this guy with the universe is such that I can say
something of this type, like it's sort of injective or surjective, except that in
category theory, we don't like Latin, ok? We use Greek. So instead of saying something
is surjective, you know "sur-", "onto" and in Greek it's "epi-", right?
so we'll say that something is surjective it's called "epic"
ok and something that's injective it's called "monic".
Or monomorphism, monic thingy, or monomorphism
and it's can be called epimorphism.
I'm getting outside of my area.
Or an epic morphism okay? So epimorphism and surjective function: it's
the same thing when you consider just set theory, right? And injective and
monic is the same in set theory. But these things are defined for any
category, so they are much more general. So how would we define this stuff, ok, let me
erase this maybe, we remember this, it's an easy thing.
How would we define, let's say an epimorphism? I think that's easier.
So we have a function from A to B, sorry not a function, a morphism,
ok but let's let's, first thing about the function but let's try to define
this property of a function that's surjective in such a way that we don't talk about elements.
Ok so first we'll start with a set, but we'll try to find the property of
a surjective function that can be expressed purely in terms of other functions
So let's first think about– if it's not surjective what does it mean? If it's not
surjective it means that there are guys here that are not in the image, right? so
suppose that I take another function called g from this set, so this is my set A, this is my set B
let's say I take a set C, whatever set, right, and the function g from B to C. So
as you can see function g will have values for elements that are in this terra
incognita and also for these elements that are in the image of f, right? But if I
compose these two, and here's the trick, I have to talk about composition because
composition is the only thing i have at my disposal in a category. So I better
talk in terms of composition, right? So if I talk in terms of composition, if I
compose this function f with g then look what happens: this x will go inside the
image and g will take it to some– so this is, lets say y, this is some z, right
So g after f will actually not probe this
terra incognita, right? Even though g maps everything, inside composition it will
actually only act on this, right? Inside the composition. Because it's acting on
y's and these y's were obtained by acting on x's using f.
That's how a composition of functions is defined in Set. Aha. So if I take two
such functions, g_1 let's say, and another function g_2, and these functions only
differ here, so g_1, maps this guy
into this point and g_2 maps this guy then g_1 is different from g_2 right?
They map points differently, or actually what i want to say is the same
point is mapped by g_1 and g_2 to something different, right?
That's the difference. The same point is mapped to different points, that means g_1
and g_2 are different functions. But if they only differ outside, in this halo, the
composition with f will be the same. So the composition g1 after f
will be the same as g_2 after f, even though g_1 is different from g_2. So
the converse of this is if for every g_1 and g_2
If g_1 after f equals g_2 after f
It follows that g_1 is equal to g_2
for every such combination then, I can say this function is surjective, right I'm
just like reversing the logic. If, for any object C and any pair of functions
g1 and g2 from B to C, if g_1 after f is the same as g_2 after f, then
g_1 must be equal to g_2. If this is true, then the function is surjective. And look at what
happened, now I have expressed this purely in terms of categorical terms
And maybe I should put one more, for every c for all c's, for all objects
c's for all morphisms g_1 g_2 that go from B to C if from g_1 f equals g_2 f
it follows that g_1 is equal to g_2, then this is an epimorphism. And this
definition will work for any category. In the category of Set, this is just exactly the
same as a surjective function, but in any category I can define something like
this because I'm only using composition. But notice that I'm using
quantifiers "for all". So I have to consider all objects in my category all
c's, and all possible pairs of morphisms from B to C, right? So I'm really,
really looking at the whole universe just to define this one property of f. And
the way to remember this property, one way of looking at this, is that if i have
this kind of relation it means that i can i can cancel something on the
right, it means if I have g_1 after f equals
g_2 after f, then I can cancel f. And what do I get? g_1 equals g_2. Right? This is this is
actually a good point for a break