[music] In this segment I want to continue talking about subtyping, but now talk about
object-oriented programming. This is a good way to do it.
We've learned the theory of how subtyping should work for records and for functions.
We can actually use that to understand how class-based object-oriented programming
can have a static type checker. These really are the core techniques
used to type-check languages like Java and C#.
So as I mentioned back when we studied interfaces, in these languages, the class
names are also types. Ruby doesn't have static types, but in these
languages, when you declare a class C, you also get a type C.
And the subclass relationship is also the subtype relationship:
if C is a subclass of D, then the type C is a subtype of D.
And because subtyping is transitive, you are in fact a subtype of anything
higher than you in the subclass hierarchy, all the way up
to Object at the top. So if we're going to do this, we need to
obey our substitution principle: we have to make sure that any instance of a
subclass can be used anywhere an instance of a superclass is expected, without
anyone ever calling a method that doesn't exist or accessing a field that doesn't
exist. Fields are like instance variables,
but in these languages they actually are part of the class definition, so they're
part of the type that comes along with those classes.
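As a minimal sketch of the substitution principle in Java (the class and method names here are made up for illustration, not taken from the lecture's slides), a method written against the superclass type can be handed a subclass instance, and the same object can be viewed at any type above it in the hierarchy:

```java
// Hypothetical classes for illustration only.
class Point {
    double x, y;
    Point(double x, double y) { this.x = x; this.y = y; }
    double distFromOrigin() { return Math.sqrt(x * x + y * y); }
}

class ColorPoint extends Point {
    String color; // an extra field that the type Point does not promise
    ColorPoint(double x, double y, String color) {
        super(x, y);
        this.color = color;
    }
}

class SubstitutionDemo {
    // Written against Point only: it never uses anything a Point lacks.
    static double dist(Point p) { return p.distFromOrigin(); }

    static double run() {
        Point p = new ColorPoint(3.0, 4.0, "red"); // ColorPoint is a subtype of Point
        Object o = p;                               // and, transitively, of Object
        return dist(p);                             // returns 5.0
    }
}
```

The type checker accepts `dist(p)` because everything `dist` can do with a `Point` is also available on a `ColorPoint`.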
So the way to think about this, given the way we studied subtyping, is that objects are
essentially records. They're a little different.
I'll explain that at the end, but they have a bunch of fields in them.
They have a bunch of methods in them. And you can think of the
field names and the method names as just the names of the record slots,
the record fields. And the thing that drives our subtyping is
that the fields are generally mutable, so we know that depth subtyping is not going
to be sound on those. But the methods are typically immutable,
so we can have subtyping on the methods. The methods are like functions, so our
rules for contravariance and covariance, flipped around for the arguments and the
same way around for the result, are how we can reason about how a subclass can
change the type of a method. So you really could design a type system
for an object-oriented programming language using this idea of record types.
In your subtypes, you would allow extra fields and methods, just like subclasses
always do. You're allowed to add new fields and
methods. And we know
from our study of width subtyping that this will be sound:
we can use an instance of the subclass as though it's an instance of the
superclass because no code will care if the object has some extra things
that the type doesn't promise that it has. And we know that if you're overriding a
method, the new method, which is like a
function, had better be usable wherever the superclass's method could be
used. And that means the arguments will have to
be contravariant: in the subtype, the arguments will have to
be supertypes of what they were in the superclass.
But the return type is covariant. Here, the jargon means that the subclass,
the overriding method, can use a subtype of the type that the superclass method
used for the return type. And this is all sound related
to [INAUDIBLE]. Because you can't update a method to be
some other method in these languages. So that would all work.
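The return-type half of this rule can be seen directly in Java, which lets an overriding method narrow its return type. The following is only a sketch with hypothetical class names (`Point`, `ColorPoint`, `copy`); it shows that callers who only know about the superclass still get what they were promised:

```java
// Hypothetical classes for illustration only.
class Point {
    final double x, y;
    Point(double x, double y) { this.x = x; this.y = y; }
    Point copy() { return new Point(x, y); }
}

class ColorPoint extends Point {
    final String color;
    ColorPoint(double x, double y, String color) {
        super(x, y);
        this.color = color;
    }
    // Covariant return: ColorPoint is a subtype of Point, so this is a legal override.
    @Override
    ColorPoint copy() { return new ColorPoint(x, y, color); }
}

class CovariantReturnDemo {
    static String run() {
        ColorPoint cp = new ColorPoint(1.0, 2.0, "blue");
        Point asPoint = cp;
        Point viaSuper = asPoint.copy(); // statically a Point, as the supertype promised
        ColorPoint viaSub = cp.copy();   // no downcast needed through the subtype
        return viaSub.color + ":" + (viaSuper instanceof ColorPoint); // "blue:true"
    }
}
```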
It turns out if you actually look at Java and C#,
they kind of do this, and then they make some slightly different
decisions. And that's fine.
They don't use types like our record types.
You never see something like {x:real, y:real}.
They just reuse the class names, or with interfaces like I showed you, the
interface name. So they have names for types
rather than writing out the contents of the classes. That's okay.
That actually does restrict subtyping. So, suppose I had a ColorPoint class and
I had a Point class, but I did not say that ColorPoint was a subclass of Point.
I just re-implemented everything that Point had.
Well, from a type soundness perspective it would be fine for ColorPoint to be a
subtype of Point, but in Java and C# it does not get to be.
If you want one type to be a subtype of another type and those types are both
names of classes, they have to actually be related in the subclassing relationship;
you can't just happen to have all the things that the other class
has. And so that's a subset of what would be
okay, and whenever you're more restricted than what would be okay, that's okay.
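To make the restriction concrete, here is a small Java sketch (all names are hypothetical). The second class has every member the first one has, but because it never declares `extends Point`, the type checker does not treat it as a subtype:

```java
// Hypothetical classes for illustration only.
class Point {
    double x, y;
    double distFromOrigin() { return Math.sqrt(x * x + y * y); }
}

// Re-implements everything Point has, but does not say "extends Point".
class StandaloneColorPoint {
    double x, y;
    String color;
    double distFromOrigin() { return Math.sqrt(x * x + y * y); }
}

class NominalDemo {
    static double dist(Point p) { return p.distFromOrigin(); }

    static boolean standaloneIsSubtypeOfPoint() {
        // dist(new StandaloneColorPoint()); // rejected at compile time: incompatible types
        // Reflection reports the same answer the type checker uses.
        return Point.class.isAssignableFrom(StandaloneColorPoint.class); // false
    }
}
```

Accepting the commented-out call would be sound, since nothing `dist` does could fail on a `StandaloneColorPoint`; the language simply chooses to accept fewer programs than soundness alone would require.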
So, we know that a subclass, number 2 here, can add fields and methods.
That works fine; that's exactly how these languages work.
And we know that a subclass can override a method and make the return type a
subtype, and these languages allow that. Now we could allow an overriding method to
change the argument types, as long as we are contravariant,
as long as we are flipped around. But these languages choose not to do that.
They make a different design choice. They say that if you change the argument
types, you're not actually overriding at all.
Instead you're just creating a different method with the same name.
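A short Java sketch of that design choice follows (the classes `Animal`, `Cat`, `A`, and `B` are made up for illustration). The subclass widens the argument type, which contravariance would allow as an override, yet Java ends up with two separate methods named `m`:

```java
// Hypothetical classes for illustration only.
class Animal {}
class Cat extends Animal {}

class A {
    String m(Cat c) { return "A.m(Cat)"; }
}

class B extends A {
    // The argument type is a supertype of Cat, so treating this as an override
    // would be sound. Java instead treats it as a new, overloaded method;
    // annotating it with @Override would be a compile-time error.
    String m(Animal a) { return "B.m(Animal)"; }
}

class OverloadDemo {
    static String run() {
        B b = new B();
        // Overloads are picked from the static argument types: a Cat argument
        // selects the inherited m(Cat), an Animal argument selects m(Animal).
        return b.m(new Cat()) + " / " + b.m(new Animal());
    }
}
```

Here `run()` returns `"A.m(Cat) / B.m(Animal)"`, which shows that `B` did not replace `A`'s method at all.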
And we know that you can add methods, so that doesn't cause a problem. This
issue of having multiple methods with the same name is the static overloading I've
mentioned once or twice in the lectures, and I'm not going into the details of
exactly how it works because it's just complicated
without adding that many interesting concepts to the study of programming
languages, okay? If I could get you to improve your
terminology in one way related to object-oriented programming, it would probably
be to teach you the distinction between classes and types.
And this is a little difficult because Java and C# purposely confuse them by
making every class also be a type. But let me give it a try anyway.
A class is what defines an object's behavior.
These are the things we defined in Ruby, right?
Every object has a class, and a class defines behavior.
A class has method definitions with bodies that have code that returns things.
A type describes an object's methods' argument and result types.
It says, some object that has this type has a method foo that takes a string and
returns an object, or something like that. Right?
A type describes what an object has in terms of methods
and data types, and subtypes are things that are substitutable in terms
of those methods and their types. These are actually separate concepts: you could
have types that are not related to classes, or classes that are not
related to types. But for good reasons of convenience,
most statically typed class-based object-oriented languages choose to
confuse these ideas by reusing class names as type names. The type
represented by some class name is the type you get when you go find all the methods
in the class definition and read out their argument types and result types; that's
the type we need. This is very convenient in practice,
but as a matter of terminology, classes are about behavior and types are about
interfaces, about what you can take and what you return.
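One way to see the distinction in Java is an interface, which is a type with no behavior attached. In this sketch (the names `Greeter`, `Formal`, and `Casual` are invented for the example), two unrelated classes define different behavior but share one type:

```java
// A type: it only says what a method takes and returns, with no body.
interface Greeter {
    String greet(String name);
}

// Two classes: each one defines behavior, and neither is a subclass of the other.
class Formal implements Greeter {
    public String greet(String name) { return "Good day, " + name; }
}

class Casual implements Greeter {
    public String greet(String name) { return "hey " + name; }
}

class ClassVsTypeDemo {
    static String run() {
        Greeter a = new Formal(); // same static type...
        Greeter b = new Casual(); // ...different classes, different behavior
        return a.greet("Ann") + " | " + b.greet("Ann");
    }
}
```

Here `run()` returns `"Good day, Ann | hey Ann"`. The names `Formal` and `Casual` are also usable as types in Java, which is exactly the convenience that blurs the two ideas.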
A couple of optional things that are relevant to this segment, but they're more details
and they're not things that you need to learn.
This is not very interesting, but Java and C# are perfectly sound,
right? The way their subtyping and subclassing
work, as long as you don't have explicit downcasts, which I'm not getting into here,
it all works. Whenever you call a method m on some
expression, if that expression is not null, then the receiving object actually has a
method m. It gets it right.
But various times when you're programming in these languages you run into details where
you really think that they got it wrong. Here is one that I stumbled across a
couple of years ago, and it's just because I did not know Java as well as I thought I
did. Let's say that a superclass declares some
field, all right, and it has some type.
So you would think that if the subclass
is going to have that field, it would have to have the same type, because fields are
mutable, and we know from depth subtyping that you can't change it.
Well, it turns out, in Java, if the superclass and the subclass both define a
field with the same name, say foo, you can do that.
And the two declarations of the field can have totally different and unrelated
types. And it all goes through the type checker.
And everything seems to work fine. I was really confused about this, and
it turns out that Java decided that if you do that, you just have two fields
named foo in the same object, and the only question now is, when you say foo, which
field are you talking about? Right?
And what they decided is that, in the superclass,
you're talking about the one defined in the superclass.
And in the subclass, you're talking about the one defined in the subclass.
And there may be various ways to get at the other one if you really need to.
I'm not sure that there is, actually. But that resolved my confusion.
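Here is a small Java sketch of that situation (the class names `Sup` and `Sub` are hypothetical). Java resolves a field name from the static type of the expression, so each class sees its own `foo`, and the hidden one remains reachable through `super.foo` or through a reference whose static type is the superclass:

```java
// Hypothetical classes for illustration only.
class Sup {
    int foo = 1;
    int readInSuper() { return foo; } // refers to the foo declared in Sup
}

class Sub extends Sup {
    String foo = "two"; // unrelated type; a second field, not a replacement
    String readInSub() { return foo; }           // refers to the foo declared in Sub
    int readSuperFromSub() { return super.foo; } // reaches the hidden field
}

class FieldHidingDemo {
    static String run() {
        Sub s = new Sub();
        Sup asSup = s; // same object, viewed at the superclass type
        return s.readInSuper() + "," + s.readInSub() + ","
                + asSup.foo + "," + s.readSuperFromSub();
    }
}
```

Here `run()` returns `"1,two,1,1"`: the one object holds both fields, and no read ever sees a value of the wrong type, which is why this does not break soundness.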
So I thought that they were doing something unsound, but when you're confused you don't
just keep trying things out; you go and get the language manual and you read about
what they're actually doing. And, you know, the people who designed Java are fairly
smart, and they would not do something that would break the soundness of their type
system, even if occasionally you use some feature you didn't realize
you were using, and you find it a little confusing.
That was optional topic number 1. This one, if you're still with me, you may
find more interesting, which is that "self" in Ruby, or what Java and C#
and C++ call "this", is actually special.
You very well might think of it as an argument,
because back when we encoded OOP in
Racket, remember how our functions took an extra argument that was self?
If you think of self as an argument to methods,
then it's a very special argument, because it gets to be treated covariantly, even
though all the other arguments, like I spent the previous segment on, have to be
contravariant. So, there's an example here.
Suppose you have some class A with a method m. Okay.
If class B, which is a subclass of A, overrides m,
then in m, in the subclass, we get to know that self is a B.
We get to know down here in the bottom m that there's a field x and we can use it.
We can even return it. Okay.
That only works if self is covariant, that somehow self, if you think of it as an
argument, in the subclass gets to be a B even though in the superclass, we only
know it's an A. And that's actually covariance.
It's not contravariance, so is this unsound?
And it turns out it is not unsound because it's not a normal argument.
It's not something where callers of m can choose what object to pass.
They have to pass self. It has to be the entire object.
So whenever this bottom m is evaluated, we know that self will be an instance of B,
and so it's fine to assume that, even though in the superclass
we did not know that. We only know that in the subclass.
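The slide's example is not part of the transcript, so the following Java version is a reconstruction under assumptions: the names A, B, m, and x come from the discussion above, while the bodies and the value 42 are invented. It shows the overriding method using a field that only B has:

```java
class A {
    int m() { return 0; } // here `this` is only known to be an A
}

class B extends A {
    int x = 42; // a field that A does not have

    @Override
    int m() { return this.x; } // here `this` is known to be a B, so x is available
}

class SelfDemo {
    static int run() {
        A a = new B();
        // The caller cannot choose a different receiver for B's m: dynamic
        // dispatch only runs it when the object really is a B.
        return a.m(); // 42
    }
}
```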
And so that is our study in taking everything we learned about subtyping in
a more careful and precise environment and seeing how it actually explains a lot
of how type systems in object-oriented languages work.