diff --git a/diff-en/2ech6-3ech6.diff b/diff-en/2ech6-3ech6.diff new file mode 100644 index 0000000..f362058 --- /dev/null +++ b/diff-en/2ech6-3ech6.diff @@ -0,0 +1,1090 @@ +diff --git a/2ech6.md b/3ech6.md +index cc81bf0..9f69b59 100644 +--- a/2ech6.md ++++ b/3ech6.md +@@ -1,90 +1,96 @@ + # Chapter 6: The Secret Life of Objects + +-> The problem with object-oriented languages is they've got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle. ++> An abstract data type is realized by writing a special kind of program […] which defines the type in terms of the operations which can be performed on it. + > +-> <footer>Joe Armstrong, <cite>interviewed in Coders at Work</cite></footer> ++> <footer>Barbara Liskov, <cite>Programming with Abstract Data Types</cite></footer> + +-When a programmer says “object”, this is a loaded term. In my profession, objects are a way of life, the subject of holy wars, and a beloved buzzword that still hasn't quite lost its power. ++[Chapter 4](04_data.html) introduced JavaScript's objects. In programming culture, we have a thing called _object-oriented programming_, a set of techniques that use objects (and related concepts) as the central principle of program organization. + +-To an outsider, this is probably a little confusing. Let's start with a brief history of objects as a programming construct. ++Though no one really agrees on its precise definition, object-oriented programming has shaped the design of many programming languages, including JavaScript. This chapter will describe the way these ideas can be applied in JavaScript. + +-## History ++## Encapsulation + +-This story, like most programming stories, starts with the problem of complexity. One philosophy is that complexity can be made manageable by separating it into small compartments that are isolated from each other. These compartments have ended up with the name _objects_. 
++The core idea in object-oriented programming is to divide programs into smaller pieces and make each piece responsible for managing its own state. + +-An object is a hard shell that hides the gooey complexity inside it and instead offers us a few knobs and connectors (such as methods) that present an _interface_ through which the object is to be used. The idea is that the interface is relatively simple and all the complex things going on _inside_ the object can be ignored when working with it. ++This way, some knowledge about the way a piece of the program works can be kept _local_ to that piece. Someone working on the rest of the program does not have to remember or even be aware of that knowledge. Whenever these local details change, only the code directly around it needs to be updated. + +-![A simple interface can hide a lot of complexity.](img/object.jpg) ++Different pieces of such a program interact with each other through _interfaces_, limited sets of functions or bindings that provide useful functionality at a more abstract level, hiding its precise implementation. + +-As an example, you can imagine an object that provides an interface to an area on your screen. It provides a way to draw shapes or text onto this area but hides all the details of how these shapes are converted to the actual pixels that make up the screen. You'd have a set of methods—for example, `drawCircle`—and those are the only things you need to know in order to use such an object. ++Such program pieces are modeled using objects. Their interface consists of a specific set of methods and properties. Properties that are part of the interface are called _public_. The others, which outside code should not be touching, are called _private_. + +-These ideas were initially worked out in the 1970s and 1980s and, in the 1990s, were carried up by a huge wave of hype—the object-oriented programming revolution. 
Suddenly, there was a large tribe of people declaring that objects were the _right_ way to program—and that anything that did not involve objects was outdated nonsense. ++Many languages provide a way to distinguish public and private properties and will prevent outside code from accessing the private ones altogether. JavaScript, once again taking the minimalist approach, does not. Not yet, at least—there is work underway to add this to the language. + +-That kind of zealotry always produces a lot of impractical silliness, and there has been a sort of counter-revolution since then. In some circles, objects have a rather bad reputation nowadays. ++Even though the language doesn't have this distinction built in, JavaScript programmers _are_ successfully using this idea. Typically, the available interface is described in documentation or comments. It is also common to put an underscore (`_`) character at the start of property names to indicate that those properties are private. + +-I prefer to look at the issue from a practical, rather than ideological, angle. There are several useful concepts, most importantly that of _encapsulation_ (distinguishing between internal complexity and external interface), that the object-oriented culture has popularized. These are worth studying. +- +-This chapter describes JavaScript's rather eccentric take on objects and the way they relate to some classical object-oriented techniques. ++Separating interface from implementation is a great idea. It is usually called _encapsulation_. + + ## Methods + +-Methods are simply properties that hold function values. This is a simple method: ++Methods are nothing more than properties that hold function values. This is a simple method: + + ``` +-var rabbit = {}; ++let rabbit = {}; + rabbit.speak = function(line) { +- console.log("The rabbit says '" + line + "'"); ++ console.log(`The rabbit says '${line}'`); + }; + + rabbit.speak("I'm alive."); + // → The rabbit says 'I'm alive.' 
+ ``` + +-Usually a method needs to do something with the object it was called on. When a function is called as a method—looked up as a property and immediately called, as in `object.method()`—the special variable `this` in its body will point to the object that it was called on. ++Usually a method needs to do something with the object it was called on. When a function is called as a method—looked up as a property and immediately called, as in `object.method()`—the binding called `this` in its body automatically points at the object that it was called on. + + ``` + function speak(line) { +- console.log("The " + this.type + " rabbit says '" + +- line + "'"); ++ console.log(`The ${this.type} rabbit says '${line}'`); + } +-var whiteRabbit = {type: "white", speak: speak}; +-var fatRabbit = {type: "fat", speak: speak}; ++let whiteRabbit = {type: "white", speak}; ++let hungryRabbit = {type: "hungry", speak}; + + whiteRabbit.speak("Oh my ears and whiskers, " + + "how late it's getting!"); + // → The white rabbit says 'Oh my ears and whiskers, how + // late it's getting!' +-fatRabbit.speak("I could sure use a carrot right now."); +-// → The fat rabbit says 'I could sure use a carrot +-// right now.' ++hungryRabbit.speak("I could use a carrot right now."); ++// → The hungry rabbit says 'I could use a carrot right now.' ++``` ++ ++You can think of `this` as an extra parameter that is passed in a different way. If you want to pass it explicitly, you can use a function's `call` method, which takes the `this` value as first argument and treats further arguments as normal parameters. ++ ++``` ++speak.call(hungryRabbit, "Burp!"); ++// → The hungry rabbit says 'Burp!' + ``` + +-The code uses the `this` keyword to output the type of rabbit that is speaking. Recall that the `apply` and `bind` methods both take a first argument that can be used to simulate method calls. This first argument is in fact used to give a value to `this`. 
++Since each function has its own `this` binding, whose value depends on the way it is called, you cannot refer to the `this` of the wrapping scope in a regular function defined with the `function` keyword. + +-There is a method similar to `apply`, called `call`. It also calls the function it is a method of but takes its arguments normally, rather than as an array. Like `apply` and `bind`, `call` can be passed a specific `this` value. ++Arrow functions are different—they do not bind their own `this`, but can see the `this` binding of the scope around them. Thus, you can do something like the following code, which references `this` from inside a local function: + + ``` +-speak.apply(fatRabbit, ["Burp!"]); +-// → The fat rabbit says 'Burp!' +-speak.call({type: "old"}, "Oh my."); +-// → The old rabbit says 'Oh my.' ++function normalize() { ++ console.log(this.coords.map(n => n / this.length)); ++} ++normalize.call({coords: [0, 2, 3], length: 5}); ++// → [0, 0.4, 0.6] + ``` + ++If I had written the argument to `map` using the `function` keyword, the code wouldn't work. ++ + ## Prototypes + + Watch closely. + + ``` +-var empty = {}; ++let empty = {}; + console.log(empty.toString); + // → function toString(){…} + console.log(empty.toString()); + // → [object Object] + ``` + +-I just pulled a property out of an empty object. Magic! ++I pulled a property out of an empty object. Magic! + +-Well, not really. I have simply been withholding information about the way JavaScript objects work. In addition to their set of properties, almost all objects also have a _prototype_. A prototype is another object that is used as a fallback source of properties. When an object gets a request for a property that it does not have, its prototype will be searched for the property, then the prototype's prototype, and so on. ++Well, not really. I have simply been withholding information about the way JavaScript objects work. 
In addition to their set of properties, most objects also have a _prototype_. A prototype is another object that is used as a fallback source of properties. When an object gets a request for a property that it does not have, its prototype will be searched for the property, then the prototype's prototype, and so on. + + So who is the prototype of that empty object? It is the great ancestral prototype, the entity behind almost all objects, `Object.prototype`. + +@@ -96,14 +102,14 @@ console.log(Object.getPrototypeOf(Object.prototype)); + // → null + ``` + +-As you might expect, the `Object.getPrototypeOf` function returns the prototype of an object. ++As you might guess, `Object.getPrototypeOf` returns the prototype of an object. + + The prototype relations of JavaScript objects form a tree-shaped structure, and at the root of this structure sits `Object.prototype`. It provides a few methods that show up in all objects, such as `toString`, which converts an object to a string representation. + +-Many objects don't directly have `Object.prototype` as their prototype, but instead have another object, which provides its own default properties. Functions derive from `Function.prototype`, and arrays derive from `Array.prototype`. ++Many objects don't directly have `Object.prototype` as their prototype, but instead have another object that provides a different set of default properties. Functions derive from `Function.prototype`, and arrays derive from `Array.prototype`. + + ``` +-console.log(Object.getPrototypeOf(isNaN) == ++console.log(Object.getPrototypeOf(Math.max) == + Function.prototype); + // → true + console.log(Object.getPrototypeOf([]) == +@@ -113,58 +119,103 @@ console.log(Object.getPrototypeOf([]) == + + Such a prototype object will itself have a prototype, often `Object.prototype`, so that it still indirectly provides methods like `toString`. + +-The `Object.getPrototypeOf` function obviously returns the prototype of an object. 
You can use `Object.create` to create an object with a specific prototype. ++You can use `Object.create` to create an object with a specific prototype. + + ``` +-var protoRabbit = { +- speak: function(line) { +- console.log("The " + this.type + " rabbit says '" + +- line + "'"); ++let protoRabbit = { ++ speak(line) { ++ console.log(`The ${this.type} rabbit says '${line}'`); + } + }; +-var killerRabbit = Object.create(protoRabbit); ++let killerRabbit = Object.create(protoRabbit); + killerRabbit.type = "killer"; + killerRabbit.speak("SKREEEE!"); + // → The killer rabbit says 'SKREEEE!' + ``` + ++A property like `speak(line)` in an object expression is a shorthand for defining a method. It creates a property called `speak` and gives it a function as its value. ++ + The “proto” rabbit acts as a container for the properties that are shared by all rabbits. An individual rabbit object, like the killer rabbit, contains properties that apply only to itself—in this case its type—and derives shared properties from its prototype. + +-## Constructors ++## Classes + +-A more convenient way to create objects that derive from some shared prototype is to use a _constructor_. In JavaScript, calling a function with the `new` keyword in front of it causes it to be treated as a constructor. The constructor will have its `this` variable bound to a fresh object, and unless it explicitly returns another object value, this new object will be returned from the call. ++JavaScript's prototype system can be interpreted as a somewhat informal take on an object-oriented concept called _classes_. A class defines the shape of a type of object—what methods and properties it has. Such an object is called an _instance_ of the class. + +-An object created with `new` is said to be an _instance_ of its constructor. ++Prototypes are useful for defining properties for which all instances of a class share the same value, such as methods. 
Properties that differ per instance, such as our rabbits' `type` property, need to be stored directly in the objects themselves. + +-Here is a simple constructor for rabbits. It is a convention to capitalize the names of constructors so that they are easily distinguished from other functions. ++So in order to create an instance of a given class, you have to make an object that derives from the proper prototype, but you _also_ have to make sure it, itself, has the properties that instances of this class are supposed to have. This is what a _constructor_ function does. ++ ++``` ++function makeRabbit(type) { ++ let rabbit = Object.create(protoRabbit); ++ rabbit.type = type; ++ return rabbit; ++} ++``` ++ ++JavaScript provides a way to make defining this type of function easier. If you put the keyword `new` in front of a function call, the function is treated as a constructor. This means that an object with the right prototype is automatically created, bound to `this` in the function, and returned at the end of the function. ++ ++The prototype object used when constructing objects is found by taking the `prototype` property of the constructor function. + + ``` + function Rabbit(type) { + this.type = type; + } ++Rabbit.prototype.speak = function(line) { ++ console.log(`The ${this.type} rabbit says '${line}'`); ++}; + +-var killerRabbit = new Rabbit("killer"); +-var blackRabbit = new Rabbit("black"); +-console.log(blackRabbit.type); +-// → black ++let weirdRabbit = new Rabbit("weird"); + ``` + +-Constructors (in fact, all functions) automatically get a property named `prototype`, which by default holds a plain, empty object that derives from `Object.prototype`. Every instance created with this constructor will have this object as its prototype. 
So to add a `speak` method to rabbits created with the `Rabbit` constructor, we can simply do this: ++Constructors (all functions, in fact) automatically get a property named `prototype`, which by default holds a plain, empty object that derives from `Object.prototype`. You can overwrite it with a new object if you want. Or you can add properties to the existing object, as the example does. ++ ++By convention, the names of constructors are capitalized so that they can easily be distinguished from other functions. ++ ++It is important to understand the distinction between the way a prototype is associated with a constructor (through its `prototype` property) and the way objects _have_ a prototype (which can be found with `Object.getPrototypeOf`). The actual prototype of a constructor is `Function.prototype`, since constructors are functions. Its `prototype` _property_ holds the prototype used for instances created through it. + + ``` +-Rabbit.prototype.speak = function(line) { +- console.log("The " + this.type + " rabbit says '" + +- line + "'"); +-}; +-blackRabbit.speak("Doom..."); +-// → The black rabbit says 'Doom...' ++console.log(Object.getPrototypeOf(Rabbit) == ++ Function.prototype); ++// → true ++console.log(Object.getPrototypeOf(weirdRabbit) == ++ Rabbit.prototype); ++// → true ++``` ++ ++## Class notation ++ ++So JavaScript classes are constructor functions with a prototype property. That is how they work, and until 2015, that was how you had to write them. These days, we have a less awkward notation. 
++ ++``` ++class Rabbit { ++ constructor(type) { ++ this.type = type; ++ } ++ speak(line) { ++ console.log(`The ${this.type} rabbit says '${line}'`); ++ } ++} ++ ++let killerRabbit = new Rabbit("killer"); ++let blackRabbit = new Rabbit("black"); + ``` + +-It is important to note the distinction between the way a prototype is associated with a constructor (through its `prototype` property) and the way objects _have_ a prototype (which can be retrieved with `Object.getPrototypeOf`). The actual prototype of a constructor is `Function.prototype` since constructors are functions. Its `prototype` _property_ will be the prototype of instances created through it but is not its _own_ prototype. ++The `class` keyword starts a class declaration, which allows us to define a constructor and a set of methods all in a single place. Any number of methods may be written inside the declaration's curly braces. The one named `constructor` is treated specially. It provides the actual constructor function, which will be bound to the name `Rabbit`. The others are packaged into that constructor's prototype. Thus, the class declaration above is equivalent to the constructor definition from the previous section. It just looks nicer. ++ ++Class declarations currently only allow _methods_—properties that hold functions—to be added to the prototype. This can be somewhat inconvenient when you want to save a non-function value in there. The next version of the language will probably improve this. For now, you can create such properties by directly manipulating the prototype after you've defined the class. ++ ++Like `function`, `class` can be used both in statement and in expression positions. When used as an expression, it doesn't define a binding, but just produces the constructor as a value. You are allowed to omit the class name in a class expression. 
++ ++``` ++let object = new class { getWord() { return "hello"; } }; ++console.log(object.getWord()); ++// → hello ++``` + + ## Overriding derived properties + +-When you add a property to an object, whether it is present in the prototype or not, the property is added to the object _itself_, which will henceforth have it as its own property. If there _is_ a property by the same name in the prototype, this property will no longer affect the object. The prototype itself is not changed. ++When you add a property to an object, whether it is present in the prototype or not, the property is added to the object _itself_. If there was already a property with the same name in the prototype, this property will no longer affect the object, as it is now hidden behind the object's own property. + + ``` + Rabbit.prototype.teeth = "small"; +@@ -181,11 +232,11 @@ console.log(Rabbit.prototype.teeth); + + The following diagram sketches the situation after this code has run. The `Rabbit` and `Object` prototypes lie behind `killerRabbit` as a kind of backdrop, where properties that are not found in the object itself can be looked up. + +-![Rabbit object prototype schema](img/rabbits.svg) ++
![Rabbit object prototype schema](img/rabbits.svg)
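The backdrop lookup sketched in the diagram can be traced by hand. The following is a minimal sketch, not code from the chapter: `findOwner` is a hypothetical helper, and the `teeth` values are placeholders chosen for illustration. It walks the chain with `Object.getPrototypeOf` and reports which object actually holds a given property.

```javascript
class Rabbit {
  constructor(type) {
    this.type = type;
  }
}
// Shared default, stored on the prototype (placeholder value).
Rabbit.prototype.teeth = "small";

let killerRabbit = new Rabbit("killer");
// Own property that hides the prototype's value (placeholder value).
killerRabbit.teeth = "long";
let blackRabbit = new Rabbit("black");

// Hypothetical helper: returns the first object in the prototype
// chain that has `name` as an own property, or null if none does.
function findOwner(object, name) {
  for (let o = object; o != null; o = Object.getPrototypeOf(o)) {
    if (Object.prototype.hasOwnProperty.call(o, name)) return o;
  }
  return null;
}

console.log(findOwner(killerRabbit, "teeth") == killerRabbit);
// → true
console.log(findOwner(blackRabbit, "teeth") == Rabbit.prototype);
// → true
console.log(findOwner(blackRabbit, "toString") == Object.prototype);
// → true
```

The helper calls `hasOwnProperty` through `Object.prototype` rather than on the object itself, so it keeps working for objects that have no prototype or that define their own property of that name.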
 + +-Overriding properties that exist in a prototype is often a useful thing to do. As the rabbit teeth example shows, it can be used to express exceptional properties in instances of a more generic class of objects, while letting the nonexceptional objects simply take a standard value from their prototype. ++Overriding properties that exist in a prototype can be a useful thing to do. As the rabbit teeth example shows, it can be used to express exceptional properties in instances of a more generic class of objects, while letting the nonexceptional objects take a standard value from their prototype. + +-It is also used to give the standard function and array prototypes a different `toString` method than the basic object prototype. ++Overriding is also used to give the standard function and array prototypes a different `toString` method than the basic object prototype. + + ``` + console.log(Array.prototype.toString == +@@ -195,512 +246,456 @@ console.log([1, 2].toString()); + // → 1,2 + ``` + +-Calling `toString` on an array gives a result similar to calling `.join(",")` on it—it puts commas between the values in the array. Directly calling `Object.prototype.toString` with an array produces a different string. That function doesn't know about arrays, so it simply puts the word “object” and the name of the type between square brackets. ++Calling `toString` on an array gives a result similar to calling `.join(",")` on it—it puts commas between the values in the array. Directly calling `Object.prototype.toString` with an array produces a different string. That function doesn't know about arrays, so it simply puts the word _object_ and the name of the type between square brackets. + + ``` + console.log(Object.prototype.toString.call([1, 2])); + // → [object Array] + ``` + +-## Prototype interference ++## Maps + +-A prototype can be used at any time to add new properties and methods to all objects based on it. 
For example, it might become necessary for our rabbits to dance. ++We saw the word _map_ used in the [previous chapter](05_higher_order.html#map) for an operation that transforms a data structure by applying a function to its elements. Confusing as it is, in programming the same word is also used for a related but rather different thing. ++ ++A _map_ (noun) is a data structure that associates values (the keys) with other values. For example, you might want to map names to ages. It is possible to use objects for this. + + ``` +-Rabbit.prototype.dance = function() { +- console.log("The " + this.type + " rabbit dances a jig."); ++let ages = { ++ Boris: 39, ++ Liang: 22, ++ Júlia: 62 + }; +-killerRabbit.dance(); +-// → The killer rabbit dances a jig. +-``` +- +-That's convenient. But there are situations where it causes problems. In previous chapters, we used an object as a way to associate values with names by creating properties for the names and giving them the corresponding value as their value. Here's an example from [Chapter 4](04_data.html#object_map): + ++console.log(`Júlia is ${ages["Júlia"]}`); ++// → Júlia is 62 ++console.log("Is Jack's age known?", "Jack" in ages); ++// → Is Jack's age known? false ++console.log("Is toString's age known?", "toString" in ages); ++// → Is toString's age known? true + ``` +-var map = {}; +-function storePhi(event, phi) { +- map[event] = phi; +-} + +-storePhi("pizza", 0.069); +-storePhi("touched tree", -0.081); +-``` ++Here, the object's property names are the people's names, and the property values their ages. But we certainly didn't list anybody named toString in our map. Yet, because plain objects derive from `Object.prototype`, it looks like the property is there. + +-We can iterate over all phi values in the object using a `for`/`in` loop and test whether a name is in there using the regular `in` operator. But unfortunately, the object's prototype gets in the way. ++As such, using plain objects as maps is dangerous. 
There are several possible ways to avoid this problem. First, it is possible to create objects with _no_ prototype. If you pass `null` to `Object.create`, the resulting object will not derive from `Object.prototype` and can safely be used as a map. + + ``` +-Object.prototype.nonsense = "hi"; +-for (var name in map) +- console.log(name); +-// → pizza +-// → touched tree +-// → nonsense +-console.log("nonsense" in map); +-// → true +-console.log("toString" in map); +-// → true +- +-// Delete the problematic property again +-delete Object.prototype.nonsense; ++console.log("toString" in Object.create(null)); ++// → false + ``` + +-That's all wrong. There is no event called “nonsense” in our data set. And there _definitely_ is no event called “toString”. +- +-Oddly, `toString` did not show up in the `for`/`in` loop, but the `in` operator did return true for it. This is because JavaScript distinguishes between _enumerable_ and _nonenumerable_ properties. ++Object property names must be strings. If you need a map whose keys can't easily be converted to strings—such as objects—you cannot use an object as your map. + +-All properties that we create by simply assigning to them are enumerable. The standard properties in `Object.prototype` are all nonenumerable, which is why they do not show up in such a `for`/`in` loop. ++Fortunately, JavaScript comes with a class called `Map` that is written for this exact purpose. It stores a mapping and allows any type of keys. + +-It is possible to define our own nonenumerable properties by using the `Object.defineProperty` function, which allows us to control the type of property we are creating. 
+- +-``` +-Object.defineProperty(Object.prototype, "hiddenNonsense", +- {enumerable: false, value: "hi"}); +-for (var name in map) +- console.log(name); +-// → pizza +-// → touched tree +-console.log(map.hiddenNonsense); +-// → hi + ``` ++let ages = new Map(); ++ages.set("Boris", 39); ++ages.set("Liang", 22); ++ages.set("Júlia", 62); + +-So now the property is there, but it won't show up in a loop. That's good. But we still have the problem with the regular `in` operator claiming that the `Object.prototype` properties exist in our object. For that, we can use the object's `hasOwnProperty` method. +- +-``` +-console.log(map.hasOwnProperty("toString")); ++console.log(`Júlia is ${ages.get("Júlia")}`); ++// → Júlia is 62 ++console.log("Is Jack's age known?", ages.has("Jack")); ++// → Is Jack's age known? false ++console.log(ages.has("toString")); + // → false + ``` + +-This method tells us whether the object _itself_ has the property, without looking at its prototypes. This is often a more useful piece of information than what the `in` operator gives us. ++The methods `set`, `get`, and `has` are part of the interface of the `Map` object. Writing a data structure that can quickly update and search a large set of values isn't easy, but we don't have to worry about that. Someone else did it for us, and we can go through this simple interface to use their work. + +-When you are worried that someone (some other code you loaded into your program) might have messed with the base object prototype, I recommend you write your `for`/`in` loops like this: ++If you do have a plain object that you need to treat as a map for some reason, it is useful to know that `Object.keys` only returns an object's _own_ keys, not those in the prototype. As an alternative to the `in` operator, you can use the `hasOwnProperty` method, which ignores the object's prototype. + + ``` +-for (var name in map) { +- if (map.hasOwnProperty(name)) { +- // ... 
this is an own property +- } +-} ++console.log({x: 1}.hasOwnProperty("x")); ++// → true ++console.log({x: 1}.hasOwnProperty("toString")); ++// → false + ``` + +-## Prototype-less objects +- +-But the rabbit hole doesn't end there. What if someone registered the name `hasOwnProperty` in our `map` object and set it to the value 42? Now the call to `map.hasOwnProperty` will try to call the local property, which holds a number, not a function. ++## Polymorphism + +-In such a case, prototypes just get in the way, and we would actually prefer to have objects without prototypes. We saw the `Object.create` function, which allows us to create an object with a specific prototype. You are allowed to pass `null` as the prototype to create a fresh object with no prototype. For objects like `map`, where the properties could be anything, this is exactly what we want. ++When you call the `String` function (which converts a value to a string) on an object, it will call the `toString` method on that object to try to create a meaningful string from it. I mentioned that some of the standard prototypes define their own version of `toString` so they can create a string that contains more useful information than `"[object Object]"`. You can also do that yourself. + + ``` +-var map = Object.create(null); +-map["pizza"] = 0.069; +-console.log("toString" in map); +-// → false +-console.log("pizza" in map); +-// → true ++Rabbit.prototype.toString = function() { ++ return `a ${this.type} rabbit`; ++}; ++ ++console.log(String(blackRabbit)); ++// → a black rabbit + ``` + +-Much better! We no longer need the `hasOwnProperty` kludge because all the properties the object has are its own properties. Now we can safely use `for`/`in` loops, no matter what people have been doing to `Object.prototype`. ++This is a simple instance of a powerful idea. 
When a piece of code is written to work with objects that have a certain interface—in this case, a `toString` method—any kind of object that happens to support this interface can be plugged into the code, and it will just work. + +-## Polymorphism ++This technique is called _polymorphism_. Polymorphic code can work with values of different shapes, as long as they support the interface it expects. + +-When you call the `String` function, which converts a value to a string, on an object, it will call the `toString` method on that object to try to create a meaningful string to return. I mentioned that some of the standard prototypes define their own version of `toString` so they can create a string that contains more useful information than `"[object Object]"`. ++I mentioned in [Chapter 4](04_data.html#for_of_loop) that a `for`/`of` loop can loop over several kinds of data structures. This is another case of polymorphism—such loops expect the data structure to expose a specific interface, which arrays and strings do. And you can also add this interface to your own objects! But before we can do that, we need to know what symbols are. + +-This is a simple instance of a powerful idea. When a piece of code is written to work with objects that have a certain interface—in this case, a `toString` method—any kind of object that happens to support this interface can be plugged into the code, and it will just work. ++## Symbols + +-This technique is called _polymorphism_—though no actual shape-shifting is involved. Polymorphic code can work with values of different shapes, as long as they support the interface it expects. ++It is possible for multiple interfaces to use the same property name for different things. For example, I could define an interface in which the `toString` method is supposed to convert the object into a piece of yarn. It would not be possible for an object to conform to both that interface and the standard use of `toString`. 
+ +-## Laying out a table ++That would be a bad idea, and this problem isn't that common. Most JavaScript programmers simply don't think about it. But the language designers, whose _job_ it is to think about this stuff, have provided us with a solution anyway. + +-I am going to work through a slightly more involved example in an attempt to give you a better idea what polymorphism, as well as object-oriented programming in general, looks like. The project is this: we will write a program that, given an array of arrays of table cells, builds up a string that contains a nicely laid out table—meaning that the columns are straight and the rows are aligned. Something like this: ++When I claimed that property names are strings, that wasn't entirely accurate. They usually are, but they can also be _symbols_. Symbols are values created with the `Symbol` function. Unlike strings, newly created symbols are unique—you cannot create the same symbol twice. + + ``` +-name height country +------------- ------ ------------- +-Kilimanjaro 5895 Tanzania +-Everest 8848 Nepal +-Mount Fuji 3776 Japan +-Mont Blanc 4808 Italy/France +-Vaalserberg 323 Netherlands +-Denali 6168 United States +-Popocatepetl 5465 Mexico ++let sym = Symbol("name"); ++console.log(sym == Symbol("name")); ++// → false ++Rabbit.prototype[sym] = 55; ++console.log(blackRabbit[sym]); ++// → 55 + ``` + +-The way our table-building system will work is that the builder function will ask each cell how wide and high it wants to be and then use this information to determine the width of the columns and the height of the rows. The builder function will then ask the cells to draw themselves at the correct size and assemble the results into a single string. ++The string you pass to `Symbol` is included when you convert it to a string, and can make it easier to recognize a symbol when, for example, showing it in the console. But it has no meaning beyond that—multiple symbols may have the same name. 
+ +-The layout program will communicate with the cell objects through a well-defined interface. That way, the types of cells that the program supports is not fixed in advance. We can add new cell styles later—for example, underlined cells for table headers—and if they support our interface, they will just work, without requiring changes to the layout program. ++Being both unique and useable as property names makes symbols suitable for defining interfaces that can peacefully live alongside other properties, no matter what their names are. + +-This is the interface: +- +-* `minHeight()` returns a number indicating the minimum height this cell requires (in lines). ++``` ++const toStringSymbol = Symbol("toString"); ++Array.prototype[toStringSymbol] = function() { ++ return `${this.length} cm of blue yarn`; ++}; + +-* `minWidth()` returns a number indicating this cell's minimum width (in characters). ++console.log([1, 2].toString()); ++// → 1,2 ++console.log([1, 2][toStringSymbol]()); ++// → 2 cm of blue yarn ++``` + +-* `draw(width, height)` returns an array of length `height`, which contains a series of strings that are each `width` characters wide. This represents the content of the cell. ++It is possible to include symbol properties in object expressions and classes by using square brackets around the property name. That causes the property name to be evaluated, much like the square bracket property access notation, which allows us to refer to a binding that holds the symbol. + +-I'm going to make heavy use of higher-order array methods in this example since it lends itself well to that approach. ++``` ++let stringObject = { ++ [toStringSymbol]() { return "a jute rope"; } ++}; ++console.log(stringObject[toStringSymbol]()); ++// → a jute rope ++``` + +-The first part of the program computes arrays of minimum column widths and row heights for a grid of cells. The `rows` variable will hold an array of arrays, with each inner array representing a row of cells. 
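The square-bracket notation for symbol properties is said above to work in classes as well as object expressions. The following is a minimal sketch of the class case, assuming a hypothetical `describe` symbol and a hypothetical `Rope` class, neither of which is part of the chapter. It also shows that a symbol-named method does not collide with a string-named method of the same spelling.

```javascript
// Hypothetical symbol that names an interface method.
const describe = Symbol("describe");

class Rope {
  constructor(material) {
    this.material = material;
  }
  // Computed method name: the brackets evaluate the `describe` binding.
  [describe]() {
    return `a ${this.material} rope`;
  }
  // An ordinary string-named method coexists with the symbol-named one.
  describe() {
    return "string-named method";
  }
}

const rope = new Rope("jute");
console.log(rope[describe]());
// → a jute rope
console.log(rope.describe());
// → string-named method
```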
++## The iterator interface + +-``` +-function rowHeights(rows) { +- return rows.map(function(row) { +- return row.reduce(function(max, cell) { +- return Math.max(max, cell.minHeight()); +- }, 0); +- }); +-} ++The object given to a `for`/`of` loop is expected to be _iterable_. This means that it has a method named with the `Symbol.iterator` symbol (a symbol value defined by the language, stored as a property of the `Symbol` function). + +-function colWidths(rows) { +- return rows[0].map(function(_, i) { +- return rows.reduce(function(max, row) { +- return Math.max(max, row[i].minWidth()); +- }, 0); +- }); +-} +-``` ++When called, that method should return an object that provides a second interface, _iterator_. This is the actual thing that iterates. It has a `next` method that returns the next result. That result should be an object with a `value` property, providing the next value, if there is one, and a `done` property which should be true when there are no more results and false otherwise. + +-Using a variable name starting with an underscore (_) or consisting entirely of a single underscore is a way to indicate (to human readers) that this argument is not going to be used. ++Note that the `next`, `value`, and `done` property names are plain strings, not symbols. Only `Symbol.iterator`, which is likely to be added to a _lot_ of different objects, is an actual symbol. + +-The `rowHeights` function shouldn't be too hard to follow. It uses `reduce` to compute the maximum height of an array of cells and wraps that in `map` in order to do it for all rows in the `rows` array. ++We can directly use this interface ourselves. + +-Things are slightly harder for the `colWidths` function because the outer array is an array of rows, not of columns. I have failed to mention so far that `map` (as well as `forEach`, `filter`, and similar array methods) passes a second argument to the function it is given: the index of the current element. 
By mapping over the elements of the first row and only using the mapping function's second argument, `colWidths` builds up an array with one element for every column index. The call to `reduce` runs over the outer `rows` array for each index and picks out the width of the widest cell at that index. ++``` ++let okIterator = "OK"[Symbol.iterator](); ++console.log(okIterator.next()); ++// → {value: "O", done: false} ++console.log(okIterator.next()); ++// → {value: "K", done: false} ++console.log(okIterator.next()); ++// → {value: undefined, done: true} ++``` + +-Here's the code to draw a table: ++Let's implement an iterable data structure. We'll build a _matrix_ class, acting as a two-dimensional array. + + ``` +-function drawTable(rows) { +- var heights = rowHeights(rows); +- var widths = colWidths(rows); ++class Matrix { ++ constructor(width, height, element = (x, y) => undefined) { ++ this.width = width; ++ this.height = height; ++ this.content = []; + +- function drawLine(blocks, lineNo) { +- return blocks.map(function(block) { +- return block[lineNo]; +- }).join(" "); ++ for (let y = 0; y < height; y++) { ++ for (let x = 0; x < width; x++) { ++ this.content[y * width + x] = element(x, y); ++ } ++ } + } + +- function drawRow(row, rowNum) { +- var blocks = row.map(function(cell, colNum) { +- return cell.draw(widths[colNum], heights[rowNum]); +- }); +- return blocks[0].map(function(_, lineNo) { +- return drawLine(blocks, lineNo); +- }).join("\n"); ++ get(x, y) { ++ return this.content[y * this.width + x]; ++ } ++ set(x, y, value) { ++ this.content[y * this.width + x] = value; + } +- +- return rows.map(drawRow).join("\n"); + } + ``` + +-The `drawTable` function uses the internal helper function `drawRow` to draw all rows and then joins them together with newline characters. +- +-The `drawRow` function itself first converts the cell objects in the row to _blocks_, which are arrays of strings representing the content of the cells, split by line. 
A single cell containing simply the number 3776 might be represented by a single-element array like `["3776"]`, whereas an underlined cell might take up two lines and be represented by the array `["name", "----"]`. ++The class stores its content in a single array of _width_ × _height_ elements. The elements are stored row-by-row, so, for example, the third element in the fifth row is (using zero-based indexing) stored at position 4 × _width_ + 2. + +-The blocks for a row, which all have the same height, should appear next to each other in the final output. The second call to `map` in `drawRow` builds up this output line by line by mapping over the lines in the leftmost block and, for each of those, collecting a line that spans the full width of the table. These lines are then joined with newline characters to provide the whole row as `drawRow`'s return value. ++The constructor function takes a width, height, and an optional content function that will be used to fill in the initial values. There are `get` and `set` methods to retrieve and update elements in the matrix. + +-The function `drawLine` extracts lines that should appear next to each other from an array of blocks and joins them with a space character to create a one-character gap between the table's columns. +- +-Now let's write a constructor for cells that contain text, which implements the interface for table cells. The constructor splits a string into an array of lines using the string method `split`, which cuts up a string at every occurrence of its argument and returns an array of the pieces. The `minWidth` method finds the maximum line width in this array. ++When looping over a matrix, you are usually interested in the position of the elements as well as the elements themselves, so we'll have our iterator produce objects with `x`, `y`, and `value` properties. 
+ + ``` +-function repeat(string, times) { +- var result = ""; +- for (var i = 0; i < times; i++) +- result += string; +- return result; +-} +- +-function TextCell(text) { +- this.text = text.split("\n"); +-} +-TextCell.prototype.minWidth = function() { +- return this.text.reduce(function(width, line) { +- return Math.max(width, line.length); +- }, 0); +-}; +-TextCell.prototype.minHeight = function() { +- return this.text.length; +-}; +-TextCell.prototype.draw = function(width, height) { +- var result = []; +- for (var i = 0; i < height; i++) { +- var line = this.text[i] || ""; +- result.push(line + repeat(" ", width - line.length)); ++class MatrixIterator { ++ constructor(matrix) { ++ this.x = 0; ++ this.y = 0; ++ this.matrix = matrix; + } +- return result; +-}; +-``` +- +-The code uses a helper function called `repeat`, which builds a string whose value is the `string` argument repeated `times` number of times. The `draw` method uses it to add “padding” to lines so that they all have the required length. + +-Let's try everything we've written so far by building up a 5 × 5 checkerboard. +- +-``` +-var rows = []; +-for (var i = 0; i < 5; i++) { +- var row = []; +- for (var j = 0; j < 5; j++) { +- if ((j + i) % 2 == 0) +- row.push(new TextCell("##")); +- else +- row.push(new TextCell(" ")); +- } +- rows.push(row); ++ next() { ++ if (this.y == this.matrix.height) return {done: true}; ++ ++ let value = {x: this.x, ++ y: this.y, ++ value: this.matrix.get(this.x, this.y)}; ++ this.x++; ++ if (this.x == this.matrix.width) { ++ this.x = 0; ++ this.y++; ++ } ++ return {value, done: false}; ++ } + } +-console.log(drawTable(rows)); +-// → ## ## ## +-// ## ## +-// ## ## ## +-// ## ## +-// ## ## ## + ``` + +-It works! But since all cells have the same size, the table-layout code doesn't really do anything interesting. ++The class tracks the progress of iterating over a matrix in its `x` and `y` properties. 
The `next` method starts by checking whether the bottom of the matrix has been reached. If it hasn't, it _first_ creates the object holding the current value and _then_ updates its position, moving to the next row if necessary. + +-The source data for the table of mountains that we are trying to build is available in the `MOUNTAINS` variable in the sandbox and also [downloadable](http://eloquentjavascript.net/2nd_edition/code/mountains.js) from the website. +- +-We will want to highlight the top row, which contains the column names, by underlining the cells with a series of dash characters. No problem—we simply write a cell type that handles underlining. ++Let us set up the `Matrix` class to be iterable. Throughout this book, I'll occasionally use after-the-fact prototype manipulation to add methods to classes, so that the individual pieces of code remain small and self-contained. In a regular program, where there is no need to split the code into small pieces, you'd declare these methods directly in the class instead. + + ``` +-function UnderlinedCell(inner) { +- this.inner = inner; +-} +-UnderlinedCell.prototype.minWidth = function() { +- return this.inner.minWidth(); +-}; +-UnderlinedCell.prototype.minHeight = function() { +- return this.inner.minHeight() + 1; +-}; +-UnderlinedCell.prototype.draw = function(width, height) { +- return this.inner.draw(width, height - 1) +- .concat([repeat("-", width)]); ++Matrix.prototype[Symbol.iterator] = function() { ++ return new MatrixIterator(this); + }; + ``` + +-An underlined cell _contains_ another cell. It reports its minimum size as being the same as that of its inner cell (by calling through to that cell's `minWidth` and `minHeight` methods) but adds one to the height to account for the space taken up by the underline. +- +-Drawing such a cell is quite simple—we take the content of the inner cell and concatenate a single line full of dashes to it. 
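The two-part protocol, an iterable that hands out an iterator with a `next` method, can also be seen in isolation. This is a minimal sketch, assuming a hypothetical `Countdown` class that is not part of the chapter. It declares the `Symbol.iterator` method directly in the class body rather than adding it to the prototype afterwards.

```javascript
// Hypothetical iterable that counts down from `start` to 1.
class Countdown {
  constructor(start) {
    this.start = start;
  }

  // Makes instances iterable: each call returns a fresh iterator object.
  [Symbol.iterator]() {
    let current = this.start;
    return {
      next() {
        if (current <= 0) return {value: undefined, done: true};
        return {value: current--, done: false};
      }
    };
  }
}

for (let n of new Countdown(3)) {
  console.log(n);
}
// → 3
// → 2
// → 1

// Spread syntax consumes the same interface.
console.log([...new Countdown(2)]);
// → [2, 1]
```

Because the position lives in a closure created per call, two loops over the same `Countdown` instance do not interfere with each other.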
+- +-Having an underlining mechanism, we can now write a function that builds up a grid of cells from our data set. ++We can now loop over a matrix with `for`/`of`. + + ``` +-function dataTable(data) { +- var keys = Object.keys(data[0]); +- var headers = keys.map(function(name) { +- return new UnderlinedCell(new TextCell(name)); +- }); +- var body = data.map(function(row) { +- return keys.map(function(name) { +- return new TextCell(String(row[name])); +- }); +- }); +- return [headers].concat(body); ++let matrix = new Matrix(2, 2, (x, y) => `value ${x},${y}`); ++for (let {x, y, value} of matrix) { ++ console.log(x, y, value); + } +- +-console.log(drawTable(dataTable(MOUNTAINS))); +-// → name height country +-// ------------ ------ ------------- +-// Kilimanjaro 5895 Tanzania +-// … etcetera ++// → 0 0 value 0,0 ++// → 1 0 value 1,0 ++// → 0 1 value 0,1 ++// → 1 1 value 1,1 + ``` + +-The standard `Object.keys` function returns an array of property names in an object. The top row of the table must contain underlined cells that give the names of the columns. Below that, the values of all the objects in the data set appear as normal cells—we extract them by mapping over the `keys` array so that we are sure that the order of the cells is the same in every row. +- +-The resulting table resembles the example shown before, except that it does not right-align the numbers in the `height` column. We will get to that in a moment. +- +-## Getters and setters +- +-When specifying an interface, it is possible to include properties that are not methods. We could have defined `minHeight` and `minWidth` to simply hold numbers. But that'd have required us to compute them in the constructor, which adds code there that isn't strictly relevant to _constructing_ the object. It would cause problems if, for example, the inner cell of an underlined cell was changed, at which point the size of the underlined cell should also change. 
++## Getters, setters, and statics + +-This has led some people to adopt a principle of never including nonmethod properties in interfaces. Rather than directly access a simple value property, they'd use `getSomething` and `setSomething` methods to read and write the property. This approach has the downside that you will end up writing—and reading—a lot of additional methods. ++Interfaces often consist mostly of methods, but it is also okay to include properties that hold non-function values. For example, `Map` objects have a `size` property that tells you how many keys are stored in them. + +-Fortunately, JavaScript provides a technique that gets us the best of both worlds. We can specify properties that, from the outside, look like normal properties but secretly have methods associated with them. ++It is not even necessary for such an object to compute and store such a property directly in the instance. Even properties that are accessed directly may hide a method call. Such methods are called _getters_, and they are defined by writing `get` in front of the method name in an object expression or class declaration. + + ``` +-var pile = { +- elements: ["eggshell", "orange peel", "worm"], +- get height() { +- return this.elements.length; +- }, +- set height(value) { +- console.log("Ignoring attempt to set height to", value); ++let varyingSize = { ++ get size() { ++ return Math.floor(Math.random() * 100); + } + }; + +-console.log(pile.height); +-// → 3 +-pile.height = 100; +-// → Ignoring attempt to set height to 100 ++console.log(varyingSize.size); ++// → 73 ++console.log(varyingSize.size); ++// → 49 + ``` + +-In an object literal, the `get` or `set` notation for properties allows you to specify a function to be run when the property is read or written. You can also add such a property to an existing object, for example a prototype, using the `Object.defineProperty` function (which we previously used to create nonenumerable properties). 
++Whenever someone reads from this object's `size` property, the associated method is called. You can do a similar thing when a property is written to, using a _setter_. + + ``` +-Object.defineProperty(TextCell.prototype, "heightProp", { +- get: function() { return this.text.length; } +-}); ++class Temperature { ++ constructor(celsius) { ++ this.celsius = celsius; ++ } ++ get fahrenheit() { ++ return this.celsius * 1.8 + 32; ++ } ++ set fahrenheit(value) { ++ this.celsius = (value - 32) / 1.8; ++ } ++ ++ static fromFahrenheit(value) { ++ return new Temperature((value - 32) / 1.8); ++ } ++} + +-var cell = new TextCell("no\nway"); +-console.log(cell.heightProp); +-// → 2 +-cell.heightProp = 100; +-console.log(cell.heightProp); +-// → 2 ++let temp = new Temperature(22); ++console.log(temp.fahrenheit); ++// → 71.6 ++temp.fahrenheit = 86; ++console.log(temp.celsius); ++// → 30 + ``` + +-You can use a similar `set` property, in the object passed to `defineProperty`, to specify a setter method. When a getter but no setter is defined, writing to the property is simply ignored. ++The `Temperature` class allows you to read and write the temperature in either degrees Celsius or degrees Fahrenheit, but internally it stores only Celsius and automatically converts to and from Celsius in the `fahrenheit` getter and setter. + +-## Inheritance ++Sometimes you want to attach some properties directly to your constructor function, rather than to the prototype. Such methods won't have access to a class instance but can, for example, be used to provide additional ways to create instances. + +-We are not quite done yet with our table layout exercise. It helps readability to right-align columns of numbers. We should create another cell type that is like `TextCell`, but rather than padding the lines on the right side, it pads them on the left side so that they align to the right. ++Inside a class declaration, methods that have `static` written before their name are stored on the constructor.
So the `Temperature` class allows you to write `Temperature.fromFahrenheit(100)` to create a temperature using degrees Fahrenheit. + +-We could simply write a whole new constructor with all three methods in its prototype. But prototypes may themselves have prototypes, and this allows us to do something clever. ++## Inheritance + +-``` +-function RTextCell(text) { +- TextCell.call(this, text); +-} +-RTextCell.prototype = Object.create(TextCell.prototype); +-RTextCell.prototype.draw = function(width, height) { +- var result = []; +- for (var i = 0; i < height; i++) { +- var line = this.text[i] || ""; +- result.push(repeat(" ", width - line.length) + line); +- } +- return result; +-}; +-``` ++Some matrices are known to be _symmetric_. If you mirror a symmetric matrix around its top-left-to-bottom-right diagonal, it stays the same. In other words, the value stored at _x_,_y_ is always the same as that at _y_,_x_. + +-We reuse the constructor and the `minHeight` and `minWidth` methods from the regular `TextCell`. An `RTextCell` is now basically equivalent to a `TextCell`, except that its `draw` method contains a different function. ++Imagine we need a data structure like `Matrix` but one that enforces the fact that the matrix is and remains symmetrical. We could write it from scratch, but that would involve repeating some code very similar to what we already wrote. + +-This pattern is called _inheritance_. It allows us to build slightly different data types from existing data types with relatively little work. Typically, the new constructor will call the old constructor (using the `call` method in order to be able to give it the new object as its `this` value). Once this constructor has been called, we can assume that all the fields that the old object type is supposed to contain have been added. We arrange for the constructor's prototype to derive from the old prototype so that instances of this type will also have access to the properties in that prototype.
Finally, we can override some of these properties by adding them to our new prototype. ++JavaScript's prototype system makes it possible to create a _new_ class, much like the old class, but with new definitions for some of its properties. The prototype for the new class derives from the old prototype but adds a new definition for, say, the `set` method. + +-Now, if we slightly adjust the `dataTable` function to use `RTextCell`s for cells whose value is a number, we get the table we were aiming for. ++In object-oriented programming terms, this is called _inheritance_. The new class inherits properties and behavior from the old class. + + ``` +-function dataTable(data) { +- var keys = Object.keys(data[0]); +- var headers = keys.map(function(name) { +- return new UnderlinedCell(new TextCell(name)); +- }); +- var body = data.map(function(row) { +- return keys.map(function(name) { +- var value = row[name]; +- // This was changed: +- if (typeof value == "number") +- return new RTextCell(String(value)); +- else +- return new TextCell(String(value)); ++class SymmetricMatrix extends Matrix { ++ constructor(size, element = (x, y) => undefined) { ++ super(size, size, (x, y) => { ++ if (x < y) return element(y, x); ++ else return element(x, y); + }); +- }); +- return [headers].concat(body); ++ } ++ ++ set(x, y, value) { ++ super.set(x, y, value); ++ if (x != y) { ++ super.set(y, x, value); ++ } ++ } + } + +-console.log(drawTable(dataTable(MOUNTAINS))); +-// → … beautifully aligned table ++let matrix = new SymmetricMatrix(5, (x, y) => `${x},${y}`); ++console.log(matrix.get(2, 3)); ++// → 3,2 + ``` + +-Inheritance is a fundamental part of the object-oriented tradition, alongside encapsulation and polymorphism. But while the latter two are now generally regarded as wonderful ideas, inheritance is somewhat controversial. ++The use of the word `extends` indicates that this class shouldn't be directly based on the default `Object` prototype, but on some other class. 
This is called the _superclass_. The derived class is the _subclass_. ++ ++To initialize a `SymmetricMatrix` instance, the constructor calls its superclass' constructor through the `super` keyword. This is necessary because if this new object is to behave (roughly) like a `Matrix`, it is going to need the instance properties that matrices have. In order to ensure the matrix is symmetrical, the constructor wraps the `element` function to swap the coordinates for values below the diagonal. + +-The main reason for this is that it is often confused with polymorphism, sold as a more powerful tool than it really is, and subsequently overused in all kinds of ugly ways. Whereas encapsulation and polymorphism can be used to _separate_ pieces of code from each other, reducing the tangledness of the overall program, inheritance fundamentally ties types together, creating _more_ tangle. ++The `set` method again uses `super`, but this time not to call the constructor, but to call a specific method from the superclass' set of methods. We are redefining `set` but do want to use the original behavior. Because `this.set` refers to the _new_ `set` method, calling that wouldn't work. Inside class methods, `super` provides a way to call methods as they were defined in the superclass. + +-You can have polymorphism without inheritance, as we saw. I am not going to tell you to avoid inheritance entirely—I use it regularly in my own programs. But you should see it as a slightly dodgy trick that can help you define new types with little code, not as a grand principle of code organization. A preferable way to extend types is through composition, such as how `UnderlinedCell` builds on another cell object by simply storing it in a property and forwarding method calls to it in its own methods. ++Inheritance allows us to build slightly different data types from existing data types with relatively little work.
It is a fundamental part of the object-oriented tradition, alongside encapsulation and polymorphism. But while the latter two are now generally regarded as wonderful ideas, inheritance is more controversial. ++ ++Whereas encapsulation and polymorphism can be used to _separate_ pieces of code from each other, reducing the tangledness of the overall program, inheritance fundamentally ties classes together, creating _more_ tangle. When inheriting from a class, you usually have to know more about how it works than when simply using it. Inheritance can be a useful tool, and I use it now and then in my own programs, but it shouldn't be the first tool you reach for, and you probably shouldn't actively go looking for opportunities to construct class hierarchies (family trees of classes). + + ## The instanceof operator + +-It is occasionally useful to know whether an object was derived from a specific constructor. For this, JavaScript provides a binary operator called `instanceof`. ++It is occasionally useful to know whether an object was derived from a specific class. For this, JavaScript provides a binary operator called `instanceof`. + + ``` +-console.log(new RTextCell("A") instanceof RTextCell); ++console.log( ++ new SymmetricMatrix(2) instanceof SymmetricMatrix); + // → true +-console.log(new RTextCell("A") instanceof TextCell); ++console.log(new SymmetricMatrix(2) instanceof Matrix); + // → true +-console.log(new TextCell("A") instanceof RTextCell); ++console.log(new Matrix(2, 2) instanceof SymmetricMatrix); + // → false + console.log([1] instanceof Array); + // → true + ``` + +-The operator will see through inherited types. An `RTextCell` is an instance of `TextCell` because `RTextCell.prototype` derives from `TextCell.prototype`. The operator can be applied to standard constructors like `Array`. Almost every object is an instance of `Object`. ++The operator will see through inherited types, so a `SymmetricMatrix` is an instance of `Matrix`. 
The operator can also be applied to standard constructors like `Array`. Almost every object is an instance of `Object`. + + ## Summary + +-So objects are more complicated than I initially portrayed them. They have prototypes, which are other objects, and will act as if they have properties they don't have as long as the prototype has that property. Simple objects have `Object.prototype` as their prototype. ++So objects do more than just hold their own properties. They have prototypes, which are other objects. They'll act as if they have properties they don't have as long as their prototype has that property. Simple objects have `Object.prototype` as their prototype. ++ ++Constructors, which are functions whose names usually start with a capital letter, can be used with the `new` operator to create new objects. The new object's prototype will be the object found in the `prototype` property of the constructor. You can make good use of this by putting the properties that all values of a given type share into their prototype. There's a `class` notation that provides a clear way to define a constructor and its prototype. + +-Constructors, which are functions whose names usually start with a capital letter, can be used with the `new` operator to create new objects. The new object's prototype will be the object found in the `prototype` property of the constructor function. You can make good use of this by putting the properties that all values of a given type share into their prototype. The `instanceof` operator can, given an object and a constructor, tell you whether that object is an instance of that constructor. ++You can define getters and setters to secretly call methods every time an object's property is accessed. Static methods are methods stored in a class' constructor, rather than its prototype. ++ ++The `instanceof` operator can, given an object and a constructor, tell you whether that object is an instance of that constructor. 
+ + One useful thing to do with objects is to specify an interface for them and tell everybody that they are supposed to talk to your object only through that interface. The rest of the details that make up your object are now _encapsulated_, hidden behind the interface. + +-Once you are talking in terms of interfaces, who says that only one kind of object may implement this interface? Having different objects expose the same interface and then writing code that works on any object with the interface is called _polymorphism_. It is very useful. ++More than one type may implement the same interface. Code written to use an interface automatically knows how to work with any number of different objects that provide the interface. This is called _polymorphism_. + +-When implementing multiple types that differ in only some details, it can be helpful to simply make the prototype of your new type derive from the prototype of your old type and have your new constructor call the old one. This gives you an object type similar to the old type but for which you can add and override properties as you see fit. ++When implementing multiple classes that differ in only some details, it can be helpful to write the new classes as _subclasses_ of an existing class, _inheriting_ part of its behavior. + + ## Exercises + + ### A vector type + +-Write a constructor `Vector` that represents a vector in two-dimensional space. It takes `x` and `y` parameters (numbers), which it should save to properties of the same name. ++Write a class `Vec` that represents a vector in two-dimensional space. It takes `x` and `y` parameters (numbers), which it should save to properties of the same name. + +-Give the `Vector` prototype two methods, `plus` and `minus`, that take another vector as a parameter and return a new vector that has the sum or difference of the two vectors' (the one in `this` and the parameter) _x_ and _y_ values. 
++Give the `Vec` prototype two methods, `plus` and `minus`, that take another vector as a parameter and return a new vector that has the sum or difference of the two vectors' (`this` and the parameter) _x_ and _y_ values. + + Add a getter property `length` to the prototype that computes the length of the vector—that is, the distance of the point (_x_, _y_) from the origin (0, 0). + + ``` + // Your code here. + +-console.log(new Vector(1, 2).plus(new Vector(2, 3))); +-// → Vector{x: 3, y: 5} +-console.log(new Vector(1, 2).minus(new Vector(2, 3))); +-// → Vector{x: -1, y: -1} +-console.log(new Vector(3, 4).length); ++console.log(new Vec(1, 2).plus(new Vec(2, 3))); ++// → Vec{x: 3, y: 5} ++console.log(new Vec(1, 2).minus(new Vec(2, 3))); ++// → Vec{x: -1, y: -1} ++console.log(new Vec(3, 4).length); + // → 5 + ``` + +-Your solution can follow the pattern of the `Rabbit` constructor from this chapter quite closely. ++Look back to the `Rabbit` class example if you're unsure how `class` declarations look. ++ ++Adding a getter property to the constructor can be done by putting the word `get` before the method name. To compute the distance from (0, 0) to (x, y), you can use the Pythagorean theorem, which says that the square of the distance we are looking for is equal to the square of the x-coordinate plus the square of the y-coordinate. Thus, √(x<sup>2</sup> + y<sup>2</sup>) is the number you want, and `Math.sqrt` is the way you compute a square root in JavaScript. ++ ++### Groups ++ ++The standard JavaScript environment provides another data structure called `Set`. Like an instance of `Map`, a set holds a collection of values. Unlike `Map`, it does not associate other values with those—it just tracks which values are part of the set. A value can only be part of a set once—adding it again doesn't have any effect. + +-Adding a getter property to the constructor can be done with the `Object.defineProperty` function. 
To compute the distance from (0, 0) to (x, y), you can use the Pythagorean theorem, which says that the square of the distance we are looking for is equal to the square of the x-coordinate plus the square of the y-coordinate. Thus, √(x<sup>2</sup> + y<sup>2</sup>) is the number you want, and `Math.sqrt` is the way you compute a square root in JavaScript. ++Write a class called `Group` (since `Set` is already taken). Like `Set`, it has `add`, `delete`, and `has` methods. Its constructor creates an empty group, `add` adds a value to the group (but only if it isn't already a member), `delete` removes its argument from the group (if it was a member), and `has` returns a Boolean value indicating whether its argument is a member of the group. + +-### Another cell ++Use the `===` operator, or something equivalent such as `indexOf`, to determine whether two values are the same. + +-Implement a cell type named `StretchCell(inner, width, height)` that conforms to the [table cell interface](06_object.html#table_interface) described earlier in the chapter. It should wrap another cell (like `UnderlinedCell` does) and ensure that the resulting cell has at least the given `width` and `height`, even if the inner cell would naturally be smaller. ++Give the class a static `from` method that takes an iterable object as its argument and creates a group that contains all the values produced by iterating over it. + + ``` +-// Your code here. ++class Group { ++ // Your code here. ++} + +-var sc = new StretchCell(new TextCell("abc"), 1, 2); +-console.log(sc.minWidth()); +-// → 3 +-console.log(sc.minHeight()); +-// → 2 +-console.log(sc.draw(3, 2)); +-// → ["abc", " "] ++let group = Group.from([10, 20]); ++console.log(group.has(10)); ++// → true ++console.log(group.has(30)); ++// → false ++group.add(10); ++group.delete(10); ++console.log(group.has(10)); ++// → false + ``` + +-You'll have to store all three constructor arguments in the instance object.
The `minWidth` and `minHeight` methods should call through to the corresponding methods in the `inner` cell but ensure that no number less than the given size is returned (possibly using `Math.max`). ++The easiest way to do this is to store an array of group members in an instance property. The `includes` or `indexOf` methods can be used to check whether a given value is in the array. + +-Don't forget to add a `draw` method that simply forwards the call to the inner cell. ++Your class' constructor can set the member collection to an empty array. When `add` is called, it must check whether the given value is in the array, and add it, for example with `push`, otherwise. + +-### Sequence interface ++Deleting an element from an array, in `delete`, is less straightforward, but you can use `filter` to create a new array without the value. Don't forget to overwrite the property holding the members with the newly filtered version of the array. + +-Design an _interface_ that abstracts iteration over a collection of values. An object that provides this interface represents a sequence, and the interface must somehow make it possible for code that uses such an object to iterate over the sequence, looking at the element values it is made up of and having some way to find out when the end of the sequence is reached. ++The `from` method can use a `for`/`of` loop to get the values out of the iterable object and call `add` to put them into a newly created group. + +-When you have specified your interface, try to write a function `logFive` that takes a sequence object and calls `console.log` on its first five elements—or fewer, if the sequence has fewer than five elements. ++### Iterable groups + +-Then implement an object type `ArraySeq` that wraps an array and allows iteration over the array using the interface you designed. Implement another object type `RangeSeq` that iterates over a range of integers (taking `from` and `to` arguments to its constructor) instead. 
++Make the `Group` class from the previous exercise iterable. Refer back to the section about the iterator interface earlier in the chapter if you aren't clear on the exact form of the interface anymore. ++ ++If you used an array to represent the group's members, don't just return the iterator created by calling the `Symbol.iterator` method on the array. That would work, but it defeats the purpose of this exercise. ++ ++It is okay if your iterator behaves strangely when the group is modified during iteration. + + ``` +-// Your code here. ++// Your code here (and the code from the previous exercise) ++ ++for (let value of Group.from(["a", "b", "c"])) { ++ console.log(value); ++} ++// → a ++// → b ++// → c ++``` ++ ++It is probably worthwhile to define a new class `GroupIterator`. Iterator instances should have a property that tracks the current position in the group. Every time `next` is called, it checks whether it is done, and if not, moves past the current value and returns it. ++ ++The `Group` class itself gets a method named by `Symbol.iterator` that, when called, returns a new instance of the iterator class for that group. ++ ++### Borrowing a method ++ ++Earlier in the chapter I mentioned that an object's `hasOwnProperty` can be used as a more robust alternative to the `in` operator when you want to ignore the prototype's properties. But what if your map needs to include the word `"hasOwnProperty"`? You won't be able to call that method anymore, because the object's own property hides the method value. ++ ++Can you think of a way to call `hasOwnProperty` on an object that has its own property by that name? + +-logFive(new ArraySeq([1, 2])); +-// → 1 +-// → 2 +-logFive(new RangeSeq(100, 1000)); +-// → 100 +-// → 101 +-// → 102 +-// → 103 +-// → 104 + ``` ++let map = {one: true, two: true, hasOwnProperty: true}; + +-One way to solve this is to give the sequence objects _state_, meaning their properties are changed in the process of using them. 
You could store a counter that indicates how far the sequence object has advanced. ++// Fix this call ++console.log(map.hasOwnProperty("one")); ++// → true ++``` + +-Your interface will need to expose at least a way to get the next element and to find out whether the iteration has reached the end of the sequence yet. It is tempting to roll these into one method, `next`, which returns `null` or `undefined` when the sequence is at its end. But now you have a problem when a sequence actually contains `null`. So a separate method (or getter property) to find out whether the end has been reached is probably preferable. ++Remember that methods that exist on plain objects come from `Object.prototype`. + +-Another solution is to avoid changing state in the object. You can expose a method for getting the current element (without advancing any counter) and another for getting a new sequence that represents the remaining elements after the current one (or a special value if the end of the sequence is reached). This is quite elegant—a sequence value will “stay itself” even after it is used and can thus be shared with other code without worrying about what might happen to it. It is, unfortunately, also somewhat inefficient in a language like JavaScript because it involves creating a lot of objects during iteration. ++And that you can call a function with a specific `this` binding by using its `call` method. diff --git a/diff-en/2ech8-3ech8.diff b/diff-en/2ech8-3ech8.diff new file mode 100644 index 0000000..c4d5977 --- /dev/null +++ b/diff-en/2ech8-3ech8.diff @@ -0,0 +1,625 @@ +diff --git a/2ech8.md b/3ech8.md +index bcf3eba..7d57bed 100644 +--- a/2ech8.md ++++ b/3ech8.md +@@ -1,28 +1,20 @@ +-# Chapter 8Bugs and Error Handling ++# Chapter 8Bugs and Errors + + > Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. + > + > <footer>Brian Kernighan and P.J. 
Plauger, <cite>The Elements of Programming Style</cite></footer> + +-> Yuan-Ma had written a small program that used many global variables and shoddy shortcuts. Reading it, a student asked, ‘You warned us against these techniques, yet I find them in your program. How can this be?' The master said, ‘There is no need to fetch a water hose when the house is not on fire.' +-> +-> <footer>Master Yuan-Ma, <cite>The Book of Programming</cite></footer> +- +-A program is crystallized thought. Sometimes those thoughts are confused. Other times, mistakes are introduced when converting thought into code. Either way, the result is a flawed program. +- +-Flaws in a program are usually called bugs. Bugs can be programmer errors or problems in other systems that the program interacts with. Some bugs are immediately apparent, while others are subtle and might remain hidden in a system for years. ++Flaws in computer programs are usually called _bugs_. It makes programmers feel good to imagine them as little things that just happen to crawl into our work. In reality, of course, we put them there ourselves. + +-Often, problems surface only when a program encounters a situation that the programmer didn't originally consider. Sometimes such situations are unavoidable. When the user is asked to input their age and types _orange_, this puts our program in a difficult position. The situation has to be anticipated and handled somehow. ++If a program is crystallized thought, you can roughly categorize bugs into those caused by the thoughts being confused, and those caused by mistakes introduced while converting a thought to code. The former type is generally harder to diagnose and fix than the latter. + +-## Programmer mistakes ++## Language + +-When it comes to programmer mistakes, our aim is simple. We want to find them and fix them. 
Such mistakes can range from simple typos that cause the computer to complain as soon as it lays eyes on our program to subtle mistakes in our understanding of the way the program operates, causing incorrect outcomes only in specific situations. Bugs of the latter type can take weeks to diagnose. ++Many mistakes could automatically be pointed out to us by the computer, if it knew enough about what we're trying to do. But here JavaScript's looseness is a hindrance. Its concept of bindings and properties is vague enough that it will rarely catch typos before actually running the program. And even then, it allows you to do some clearly nonsensical things without complaint, such as computing `true * "monkey"`. + +-The degree to which languages help you find such mistakes varies. Unsurprisingly, JavaScript is at the “hardly helps at all” end of that scale. Some languages want to know the types of all your variables and expressions before even running a program and will tell you right away when a type is used in an inconsistent way. JavaScript considers types only when actually running the program, and even then, it allows you to do some clearly nonsensical things without complaint, such as `x = true * "monkey"`. ++There are some things that JavaScript does complain about. Writing a program that does not follow the language's grammar will immediately make the computer complain. Other things, such as calling something that's not a function or looking up a property on an undefined value, will cause an error to be reported when the program tries to perform the action. + +-There are some things that JavaScript does complain about, though. Writing a program that is not syntactically valid will immediately trigger an error. Other things, such as calling something that's not a function or looking up a property on an undefined value, will cause an error to be reported when the program is running and encounters the nonsensical action. 
+- +-But often, your nonsense computation will simply produce a `NaN` (not a number) or undefined value. And the program happily continues, convinced that it's doing something meaningful. The mistake will manifest itself only later, after the bogus value has traveled through several functions. It might not trigger an error at all but silently cause the program's output to be wrong. Finding the source of such problems can be difficult. ++But often, your nonsense computation will merely produce `NaN` (not a number) or an undefined value. And the program happily continues, convinced that it's doing something meaningful. The mistake will manifest itself only later, after the bogus value has traveled through several functions. It might not trigger an error at all but silently cause the program's output to be wrong. Finding the source of such problems can be difficult. + + The process of finding mistakes—bugs—in programs is called _debugging_. + +@@ -33,81 +25,97 @@ JavaScript can be made a _little_ more strict by enabling _strict mode_. This is + ``` + function canYouSpotTheProblem() { + "use strict"; +- for (counter = 0; counter < 10; counter++) ++ for (counter = 0; counter < 10; counter++) { + console.log("Happy happy"); ++ } + } + + canYouSpotTheProblem(); + // → ReferenceError: counter is not defined + ``` + +-Normally, when you forget to put `var` in front of your variable, as with `counter` in the example, JavaScript quietly creates a global variable and uses that. In strict mode, however, an error is reported instead. This is very helpful. It should be noted, though, that this doesn't work when the variable in question already exists as a global variable, but only when assigning to it would have created it. ++Normally, when you forget to put `let` in front of your binding, as with `counter` in the example, JavaScript quietly creates a global binding and uses that. In strict mode an error is reported instead. This is very helpful. 
It should be noted, though, that this doesn't work when the binding in question already exists as a global binding. In that case, the loop will still quietly overwrite the value of the binding. + +-Another change in strict mode is that the `this` binding holds the value `undefined` in functions that are not called as methods. When making such a call outside of strict mode, `this` refers to the global scope object. So if you accidentally call a method or constructor incorrectly in strict mode, JavaScript will produce an error as soon as it tries to read something from `this`, rather than happily working with the global object, creating and reading global variables. ++Another change in strict mode is that the `this` binding holds the value `undefined` in functions that are not called as methods. When making such a call outside of strict mode, `this` refers to the global scope object, which is an object whose properties are the global bindings. So if you accidentally call a method or constructor incorrectly in strict mode, JavaScript will produce an error as soon as it tries to read something from `this`, rather than happily writing to the global scope. + +-For example, consider the following code, which calls a constructor without the `new` keyword so that its `this` will _not_ refer to a newly constructed object: ++For example, consider the following code, which calls a constructor function without the `new` keyword so that its `this` will _not_ refer to a newly constructed object: + + ``` + function Person(name) { this.name = name; } +-var ferdinand = Person("Ferdinand"); // oops ++let ferdinand = Person("Ferdinand"); // oops + console.log(name); + // → Ferdinand + ``` + +-So the bogus call to `Person` succeeded but returned an undefined value and created the global variable `name`. In strict mode, the result is different. ++So the bogus call to `Person` succeeded but returned an undefined value and created the global binding `name`. 
In strict mode, the result is different. + + ``` + "use strict"; + function Person(name) { this.name = name; } +-// Oops, forgot 'new' +-var ferdinand = Person("Ferdinand"); ++let ferdinand = Person("Ferdinand"); // forgot new + // → TypeError: Cannot set property 'name' of undefined + ``` + + We are immediately told that something is wrong. This is helpful. + +-Strict mode does a few more things. It disallows giving a function multiple parameters with the same name and removes certain problematic language features entirely (such as the `with` statement, which is so misguided it is not further discussed in this book). ++Fortunately, constructors created with the `class` notation will always complain if they are called without `new`, making this less of a problem even in non-strict mode. + +-In short, putting a `"use strict"` at the top of your program rarely hurts and might help you spot a problem. ++Strict mode does a few more things. It disallows giving a function multiple parameters with the same name and removes certain problematic language features entirely (such as the `with` statement, which is so wrong it is not further discussed in this book). + +-## Testing ++In short, putting `"use strict"` at the top of your program rarely hurts and might help you spot a problem. + +-If the language is not going to do much to help us find mistakes, we'll have to find them the hard way: by running the program and seeing whether it does the right thing. ++## Types + +-Doing this by hand, again and again, is a sure way to drive yourself insane. Fortunately, it is often possible to write a second program that automates testing your actual program. ++Some languages want to know the types of all your bindings and expressions before even running a program. They will tell you right away when a type is used in an inconsistent way. 
JavaScript considers types only when actually running the program, and even there often tries to implicitly convert values to the type it expects, so it's not much help. + +-As an example, we once again use the `Vector` type. ++Still, types provide a useful framework for talking about programs. A lot of mistakes come from being confused about the kind of value that goes into or comes out of a function. If you have that information written down, you're less likely to get confused. ++ ++You could add a comment like this above the `goalOrientedRobot` function from the last chapter, to describe its type. + + ``` +-function Vector(x, y) { +- this.x = x; +- this.y = y; ++// (WorldState, Array) → {direction: string, memory: Array} ++function goalOrientedRobot(state, memory) { ++ // ... + } +-Vector.prototype.plus = function(other) { +- return new Vector(this.x + other.x, this.y + other.y); +-}; + ``` + +-We will write a program to check that our implementation of `Vector` works as intended. Then, every time we change the implementation, we follow up by running the test program so that we can be reasonably confident that we didn't break anything. When we add extra functionality (for example, a new method) to the `Vector` type, we also add tests for the new feature. ++There are a number of different conventions for annotating JavaScript programs with types. + +-``` +-function testVector() { +- var p1 = new Vector(10, 20); +- var p2 = new Vector(-10, 5); +- var p3 = p1.plus(p2); ++One thing about types is that they need to introduce their own complexity to be able to describe enough code to be useful. What do you think would be the type of the `randomPick` function that returns a random element from an array? You'd need to introduce a _type variable_, _T_, which can stand in for any type, so that you can give `randomPick` a type like `([T]) → T` (function from an array of _T_s to a _T_). 
++ ++When the types of a program are known, it is possible for the computer to _check_ them for you, pointing out mistakes before the program is run. There are several JavaScript dialects that add types to the language and check them. The most popular one is called [TypeScript](https://www.typescriptlang.org/). If you are interested in adding more rigor to your programs, I recommend you give it a try. ++ ++In this book, we'll continue using raw, dangerous, untyped JavaScript code. ++ ++## Testing ++ ++If the language is not going to do much to help us find mistakes, we'll have to find them the hard way: by running the program and seeing whether it does the right thing. ++ ++Doing this by hand, again and again, is a really bad idea. Not only is it annoying, it also tends to be ineffective, since it takes too much time to exhaustively test everything every time you make a change. ++ ++Computers are good at repetitive tasks, and testing is the ideal repetitive task. Automated testing is the process of writing a program that tests another program. Writing tests is a bit more work than testing manually, but once you've done it you gain a kind of superpower: it only takes you a few seconds to verify that your program still behaves properly in all the situations you wrote tests for. When you break something, you'll immediately notice, rather than randomly running into it at some later time. + +- if (p1.x !== 10) return "fail: x property"; +- if (p1.y !== 20) return "fail: y property"; +- if (p2.x !== -10) return "fail: negative x property"; +- if (p3.x !== 0) return "fail: x from plus"; +- if (p3.y !== 25) return "fail: y from plus"; +- return "everything ok"; ++Tests usually take the form of little labeled programs that verify some aspect of your code. 
For example, a set of tests for the (standard, probably already tested by someone else) `toUpperCase` method might look like this: ++ ++``` ++function test(label, body) { ++ if (!body()) console.log(`Failed: ${label}`); + } +-console.log(testVector()); +-// → everything ok ++ ++test("convert Latin text to uppercase", () => { ++ return "hello".toUpperCase() == "HELLO"; ++}); ++test("convert Greek text to uppercase", () => { ++ return "Χαίρετε".toUpperCase() == "ΧΑΊΡΕΤΕ"; ++}); ++test("don't convert case-less characters", () => { ++ return "مرحبا".toUpperCase() == "مرحبا"; ++}); + ``` + +-Writing tests like this tends to produce rather repetitive, awkward code. Fortunately, there exist pieces of software that help you build and run collections of tests (_test suites_) by providing a language (in the form of functions and methods) suited to expressing tests and by outputting informative information when a test fails. These are called _testing frameworks_. ++Writing tests like this tends to produce rather repetitive, awkward code. Fortunately, there exist pieces of software that help you build and run collections of tests (_test suites_) by providing a language (in the form of functions and methods) suited to expressing tests and by outputting informative information when a test fails. These are usually called _test runners_. ++ ++Some code is easier to test than other code. Generally, the more external objects that the code interacts with, the harder it is to set up the context in which to test it. The style of programming shown in the [previous chapter](07_robot.html), which uses self-contained persistent values rather than changing objects, tends to be easy to test. + + ## Debugging + +@@ -115,13 +123,13 @@ Once you notice that there is something wrong with your program because it misbe + + Sometimes it is obvious. 
The error message will point at a specific line of your program, and if you look at the error description and that line of code, you can often see the problem. + +-But not always. Sometimes the line that triggered the problem is simply the first place where a bogus value produced elsewhere gets used in an invalid way. And sometimes there is no error message at all—just an invalid result. If you have been solving the exercises in the earlier chapters, you will probably have already experienced such situations. ++But not always. Sometimes the line that triggered the problem is simply the first place where a flaky value produced elsewhere gets used in an invalid way. If you have been solving the exercises in earlier chapters, you will probably have already experienced such situations. + +-The following example program tries to convert a whole number to a string in any base (decimal, binary, and so on) by repeatedly picking out the last digit and then dividing the number to get rid of this digit. But the insane output that it currently produces suggests that it has a bug. ++The following example program tries to convert a whole number to a string in a given base (decimal, binary, and so on) by repeatedly picking out the last digit and then dividing the number to get rid of this digit. But the strange output that it currently produces suggests that it has a bug. + + ``` +-function numberToString(n, base) { +- var result = "", sign = ""; ++function numberToString(n, base = 10) { ++ let result = "", sign = ""; + if (n < 0) { + sign = "-"; + n = -n; +@@ -138,7 +146,7 @@ console.log(numberToString(13, 10)); + + Even if you see the problem already, pretend for a moment that you don't. We know that our program is malfunctioning, and we want to find out why. + +-This is where you must resist the urge to start making random changes to the code. Instead, _think_. Analyze what is happening and come up with a theory of why it might be happening. 
Then, make additional observations to test this theory—or, if you don't yet have a theory, make additional observations that might help you come up with one. ++This is where you must resist the urge to start making random changes to the code to see if that makes it better. Instead, _think_. Analyze what is happening and come up with a theory of why it might be happening. Then, make additional observations to test this theory—or, if you don't yet have a theory, make additional observations to help you come up with one. + + Putting a few strategic `console.log` calls into the program is a good way to get additional information about what the program is doing. In this case, we want `n` to take the values `13`, `1`, and then `0`. Let's write out its value at the start of the loop. + +@@ -151,59 +159,72 @@ Putting a few strategic `console.log` calls into the program is a good way to ge + 1.5e-323 + ``` + +-_Right_. Dividing 13 by 10 does not produce a whole number. Instead of `n /= base`, what we actually want is `n = Math.floor(n / base)` so that the number is properly “shifted” to the right. ++_Right_. Dividing 13 by 10 does not produce a whole number. Instead of `n /= base`, what we actually want is `n = Math.floor(n / base)` so that the number is properly “shifted” to the right. + +-An alternative to using `console.log` is to use the _debugger_ capabilities of your browser. Modern browsers come with the ability to set a _breakpoint_ on a specific line of your code. This will cause the execution of the program to pause every time the line with the breakpoint is reached and allow you to inspect the values of variables at that point. I won't go into details here since debuggers differ from browser to browser, but look in your browser's developer tools and search the Web for more information. Another way to set a breakpoint is to include a `debugger` statement (consisting of simply that keyword) in your program.
If the developer tools of your browser are active, the program will pause whenever it reaches that statement, and you will be able to inspect its state. ++An alternative to using `console.log` to peek into the program's behavior is to use the _debugger_ capabilities of your browser. Browsers come with the ability to set a _breakpoint_ on a specific line of your code. When the execution of the program reaches a line with a breakpoint, it is paused, and you can inspect the values of bindings at that point. I won't go into details, as debuggers differ from browser to browser, but look in your browser's developer tools or search the Web for more information. ++ ++Another way to set a breakpoint is to include a `debugger` statement (consisting of simply that keyword) in your program. If the developer tools of your browser are active, the program will pause whenever it reaches such a statement. + + ## Error propagation + +-Not all problems can be prevented by the programmer, unfortunately. If your program communicates with the outside world in any way, there is a chance that the input it gets will be invalid or that other systems that it tries to talk to are broken or unreachable. ++Not all problems can be prevented by the programmer, unfortunately. If your program communicates with the outside world in any way, it is possible to get malformed input, to become overloaded with work, or to have the network fail. + +-Simple programs, or programs that run only under your supervision, can afford to just give up when such a problem occurs. You'll look into the problem and try again. “Real” applications, on the other hand, are expected to not simply crash. Sometimes the right thing to do is take the bad input in stride and continue running. In other cases, it is better to report to the user what went wrong and then give up. But in either situation, the program has to actively do something in response to the problem. 
++If you're only programming for yourself, you can afford to just ignore such problems until they occur. But if you build something that is going to be used by anybody else, you usually want the program to do better than just crash. Sometimes the right thing to do is take the bad input in stride and continue running. In other cases, it is better to report to the user what went wrong and then give up. But in either situation, the program has to actively do something in response to the problem. + +-Say you have a function `promptInteger` that asks the user for a whole number and returns it. What should it return if the user inputs _orange_? ++Say you have a function `promptInteger` that asks the user for a whole number and returns it. What should it return if the user inputs “orange”? + +-One option is to make it return a special value. Common choices for such values are `null` and `undefined`. ++One option is to make it return a special value. Common choices for such values are `null`, `undefined`, or -1. + + ``` + function promptNumber(question) { +- var result = Number(prompt(question, "")); +- if (isNaN(result)) return null; ++ let result = Number(prompt(question)); ++ if (Number.isNaN(result)) return null; + else return result; + } + + console.log(promptNumber("How many trees do you see?")); + ``` + +-This is a sound strategy. Now any code that calls `promptNumber` must check whether an actual number was read and, failing that, must somehow recover—maybe by asking again or by filling in a default value. Or it could again return a special value to _its_ caller to indicate that it failed to do what it was asked. ++Now any code that calls `promptNumber` must check whether an actual number was read and, failing that, must somehow recover—maybe by asking again or by filling in a default value. Or it could again return a special value to _its_ caller to indicate that it failed to do what it was asked. 
++ ++In many situations, mostly when errors are common and the caller should be explicitly taking them into account, returning a special value is a good way to indicate an error. It does, however, have its downsides. First, what if the function can already return every possible kind of value? In such a function, you'll have to do something like wrap the result in an object to be able to distinguish success from failure. + +-In many situations, mostly when errors are common and the caller should be explicitly taking them into account, returning a special value is a perfectly fine way to indicate an error. It does, however, have its downsides. First, what if the function can already return every possible kind of value? For such a function, it is hard to find a special value that can be distinguished from a valid result. ++``` ++function lastElement(array) { ++ if (array.length == 0) { ++ return {failed: true}; ++ } else { ++ return {element: array[array.length - 1]}; ++ } ++} ++``` + +-The second issue with returning special values is that it can lead to some very cluttered code. If a piece of code calls `promptNumber` 10 times, it has to check 10 times whether `null` was returned. And if its response to finding `null` is to simply return `null` itself, the caller will in turn have to check for it, and so on. ++The second issue with returning special values is that it can lead to very awkward code. If a piece of code calls `promptNumber` 10 times, it has to check 10 times whether `null` was returned. And if its response to finding `null` is to simply return `null` itself, callers of the function will in turn have to check for it, and so on. + + ## Exceptions + +-When a function cannot proceed normally, what we would _like_ to do is just stop what we are doing and immediately jump back to a place that knows how to handle the problem. This is what _exception handling_ does. 
++When a function cannot proceed normally, what we would _like_ to do is just stop what we are doing and immediately jump to a place that knows how to handle the problem. This is what _exception handling_ does. + +-Exceptions are a mechanism that make it possible for code that runs into a problem to _raise_ (or _throw_) an exception, which is simply a value. Raising an exception somewhat resembles a super-charged return from a function: it jumps out of not just the current function but also out of its callers, all the way down to the first call that started the current execution. This is called _unwinding the stack_. You may remember the stack of function calls that was mentioned in [Chapter 3](03_functions.html#stack). An exception zooms down this stack, throwing away all the call contexts it encounters. ++Exceptions are a mechanism that makes it possible for code that runs into a problem to _raise_ (or _throw_) an exception. An exception can be any value. Raising one somewhat resembles a super-charged return from a function: it jumps out of not just the current function but also out of its callers, all the way down to the first call that started the current execution. This is called _unwinding the stack_. You may remember the stack of function calls that was mentioned in [Chapter 3](03_functions.html#stack). An exception zooms down this stack, throwing away all the call contexts it encounters. + +-If exceptions always zoomed right down to the bottom of the stack, they would not be of much use. They would just provide a novel way to blow up your program. Their power lies in the fact that you can set “obstacles” along the stack to _catch_ the exception as it is zooming down. Then you can do something with it, after which the program continues running at the point where the exception was caught. ++If exceptions always zoomed right down to the bottom of the stack, they would not be of much use. They'd just provide a novel way to blow up your program. 
Their power lies in the fact that you can set “obstacles” along the stack to _catch_ the exception as it is zooming down. Once you've caught an exception, you can do something with it to address the problem, and then continue to run the program. + + Here's an example: + + ``` + function promptDirection(question) { +- var result = prompt(question, ""); ++ let result = prompt(question); + if (result.toLowerCase() == "left") return "L"; + if (result.toLowerCase() == "right") return "R"; + throw new Error("Invalid direction: " + result); + } + + function look() { +- if (promptDirection("Which way?") == "L") ++ if (promptDirection("Which way?") == "L") { + return "a house"; +- else ++ } else { + return "two angry bears"; ++ } + } + + try { +@@ -213,85 +234,97 @@ try { + } + ``` + +-The `throw` keyword is used to raise an exception. Catching one is done by wrapping a piece of code in a `try` block, followed by the keyword `catch`. When the code in the `try` block causes an exception to be raised, the `catch` block is evaluated. The variable name (in parentheses) after `catch` will be bound to the exception value. After the `catch` block finishes—or if the `try` block finishes without problems—control proceeds beneath the entire `try/catch` statement. ++The `throw` keyword is used to raise an exception. Catching one is done by wrapping a piece of code in a `try` block, followed by the keyword `catch`. When the code in the `try` block causes an exception to be raised, the `catch` block is evaluated, with the name in parentheses bound to the exception value. After the `catch` block finishes—or if the `try` block finishes without problems—the program proceeds beneath the entire `try/catch` statement. + +-In this case, we used the `Error` constructor to create our exception value. This is a standard JavaScript constructor that creates an object with a `message` property. 
In modern JavaScript environments, instances of this constructor also gather information about the call stack that existed when the exception was created, a so-called _stack trace_. This information is stored in the `stack` property and can be helpful when trying to debug a problem: it tells us the precise function where the problem occurred and which other functions led up to the call that failed. ++In this case, we used the `Error` constructor to create our exception value. This is a standard JavaScript constructor that creates an object with a `message` property. In most JavaScript environments, instances of this constructor also gather information about the call stack that existed when the exception was created, a so-called _stack trace_. This information is stored in the `stack` property and can be helpful when trying to debug a problem: it tells us the function where the problem occurred and which functions made the failing call. + +-Note that the function `look` completely ignores the possibility that `promptDirection` might go wrong. This is the big advantage of exceptions—error-handling code is necessary only at the point where the error occurs and at the point where it is handled. The functions in between can forget all about it. ++Note that the `look` function completely ignores the possibility that `promptDirection` might go wrong. This is the big advantage of exceptions: Error-handling code is necessary only at the point where the error occurs and at the point where it is handled. The functions in between can forget all about it. + + Well, almost... + + ## Cleaning up after exceptions + +-Consider the following situation: a function, `withContext`, wants to make sure that, during its execution, the top-level variable `context` holds a specific context value. After it finishes, it restores this variable to its old value. ++The effect of an exception is another kind of control flow. 
Every action that might cause an exception, which is pretty much every function call and property access, might cause control to suddenly leave your code. ++ ++That means that when code has several side effects, even if its “regular” control flow looks like they'll always all happen, an exception might prevent some of them from taking place. ++ ++Here is some really bad banking code. + + ``` +-var context = null; ++const accounts = { ++ a: 100, ++ b: 0, ++ c: 20 ++}; + +-function withContext(newContext, body) { +- var oldContext = context; +- context = newContext; +- var result = body(); +- context = oldContext; +- return result; ++function getAccount() { ++ let accountName = prompt("Enter an account name"); ++ if (!accounts.hasOwnProperty(accountName)) { ++ throw new Error(`No such account: ${accountName}`); ++ } ++ return accountName; ++} ++ ++function transfer(from, amount) { ++ if (accounts[from] < amount) return; ++ accounts[from] -= amount; ++ accounts[getAccount()] += amount; + } + ``` + +-What if `body` raises an exception? In that case, the call to `withContext` will be thrown off the stack by the exception, and `context` will never be set back to its old value. ++The `transfer` function transfers a sum of money from a given account to another, asking for the name of the other account in the process. If given an invalid account name, `getAccount` throws an exception. ++ ++But `transfer` _first_ removes the money from the account, and _then_ calls `getAccount` before it adds it to another account. If it is broken off by an exception at that point, it'll just make the money disappear. ++ ++That code could have been written a little more intelligently, for example by calling `getAccount` before it starts moving money around. But often problems like this occur in more subtle ways. Even functions that don't look like they will throw an exception might do so in exceptional circumstances or when they contain a programmer mistake. 
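To make the failure mode concrete, here is a minimal sketch of the same problem that can run outside a browser. It assumes a hypothetical `lookupAccount` helper that receives the account name as an argument instead of calling `prompt`; `lookupAccount`, `unsafeTransfer`, and `totalMoney` are illustrative names and not part of the chapter's code.

```javascript
const accounts = {a: 100, b: 0, c: 20};

// Hypothetical stand-in for getAccount: the name is passed in rather
// than requested with prompt, so the sketch has no browser dependency.
function lookupAccount(name) {
  if (!accounts.hasOwnProperty(name)) {
    throw new Error(`No such account: ${name}`);
  }
  return name;
}

function unsafeTransfer(from, to, amount) {
  if (accounts[from] < amount) return;
  accounts[from] -= amount;               // first side effect has happened
  accounts[lookupAccount(to)] += amount;  // the lookup may throw here
}

// Sum of all balances, used to check whether money went missing.
function totalMoney() {
  return Object.values(accounts).reduce((sum, n) => sum + n, 0);
}

let before = totalMoney();
try {
  unsafeTransfer("a", "nobody", 50);
} catch (e) {
  console.log("Transfer failed:", e.message);
}
console.log(before, totalMoney());
// → 120 70
```

The exception is caught and reported, yet the total has dropped by 50: the withdrawal took place and the deposit never did.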
+ +-There is one more feature that `try` statements have. They may be followed by a `finally` block either instead of or in addition to a `catch` block. A `finally` block means “No matter _what_ happens, run this code after trying to run the code in the `try` block”. If a function has to clean something up, the cleanup code should usually be put into a `finally` block. ++One way to address this is to use fewer side effects. Again, a programming style that computes new values instead of changing existing data helps. If a piece of code stops running in the middle of creating a new value, no one ever sees the half-finished value, and there is no problem. ++ ++But that isn't always practical. So there is another feature that `try` statements have. They may be followed by a `finally` block either instead of or in addition to a `catch` block. A `finally` block says “no matter _what_ happens, run this code after trying to run the code in the `try` block.” + + ``` +-function withContext(newContext, body) { +- var oldContext = context; +- context = newContext; ++function transfer(from, amount) { ++ if (accounts[from] < amount) return; ++ let progress = 0; + try { +- return body(); ++ accounts[from] -= amount; ++ progress = 1; ++ accounts[getAccount()] += amount; ++ progress = 2; + } finally { +- context = oldContext; ++ if (progress == 1) { ++ accounts[from] += amount; ++ } + } + } + ``` + +-Note that we no longer have to store the result of `body` (which we want to return) in a variable. Even if we return directly from the `try` block, the `finally` block will be run. Now we can do this and be safe: +- +-``` +-try { +- withContext(5, function() { +- if (context < 10) +- throw new Error("Not enough context!"); +- }); +-} catch (e) { +- console.log("Ignoring: " + e); +-} +-// → Ignoring: Error: Not enough context! 
++This version of the function tracks its progress, and if, when leaving, it notices that it was aborted at a point where it had created an inconsistent program state, it repairs the damage it did. + +-console.log(context); +-// → null +-``` ++Note that, even though the `finally` code is run when an exception leaves the `try` block, it does not interfere with the exception. After the `finally` block runs, the stack continues unwinding. + +-Even though the function called from `withContext` exploded, `withContext` itself still properly cleaned up the `context` variable. ++Writing programs that operate reliably even when exceptions pop up in unexpected places is very hard. Many people simply don't bother, and because exceptions are typically reserved for exceptional circumstances, the problem may occur so rarely that it is never even noticed. Whether that is a good thing or a really bad thing depends on how much damage the software will do when it fails. + + ## Selective catching + +-When an exception makes it all the way to the bottom of the stack without being caught, it gets handled by the environment. What this means differs between environments. In browsers, a description of the error typically gets written to the JavaScript console (reachable through the browser's Tools or Developer menu). ++When an exception makes it all the way to the bottom of the stack without being caught, it gets handled by the environment. What this means differs between environments. In browsers, a description of the error typically gets written to the JavaScript console (reachable through the browser's Tools or Developer menu). Node.js, the browserless JavaScript environment we will discuss in [Chapter 20](20_node.html), is more careful about data corruption. It aborts the whole process when an unhandled exception occurs. + +-For programmer mistakes or problems that the program cannot possibly handle, just letting the error go through is often okay. 
An unhandled exception is a reasonable way to signal a broken program, and the JavaScript console will, on modern browsers, provide you with some information about which function calls were on the stack when the problem occurred. ++For programmer mistakes, just letting the error go through is often the best you can do. An unhandled exception is a reasonable way to signal a broken program, and the JavaScript console will, on modern browsers, provide you with some information about which function calls were on the stack when the problem occurred. + +-For problems that are _expected_ to happen during routine use, crashing with an unhandled exception is not a very friendly response. ++For problems that are _expected_ to happen during routine use, crashing with an unhandled exception is a terrible strategy. + +-Invalid uses of the language, such as referencing a nonexistent variable, looking up a property on `null`, or calling something that's not a function, will also result in exceptions being raised. Such exceptions can be caught just like your own exceptions. ++Invalid uses of the language, such as referencing a nonexistent binding, looking up a property on `null`, or calling something that's not a function, will also result in exceptions being raised. Such exceptions can also be caught. + + When a `catch` body is entered, all we know is that _something_ in our `try` body caused an exception. But we don't know _what_, or _which_ exception it caused. + +-JavaScript (in a rather glaring omission) doesn't provide direct support for selectively catching exceptions: either you catch them all or you don't catch any. This makes it very easy to _assume_ that the exception you get is the one you were thinking about when you wrote the `catch` block. ++JavaScript (in a rather glaring omission) doesn't provide direct support for selectively catching exceptions: either you catch them all or you don't catch any. 
This makes it tempting to _assume_ that the exception you get is the one you were thinking about when you wrote the `catch` block. + +-But it might not be. Some other assumption might be violated, or you might have introduced a bug somewhere that is causing an exception. Here is an example, which _attempts_ to keep on calling `promptDirection` until it gets a valid answer: ++But it might not be. Some other assumption might be violated, or you might have introduced a bug that is causing an exception. Here is an example that _attempts_ to keep on calling `promptDirection` until it gets a valid answer: + + ``` + for (;;) { + try { +- var dir = promtDirection("Where?"); // ← typo! ++ let dir = promtDirection("Where?"); // ← typo! + console.log("You chose ", dir); + break; + } catch (e) { +@@ -300,108 +333,93 @@ for (;;) { + } + ``` + +-The `for (;;)` construct is a way to intentionally create a loop that doesn't terminate on its own. We break out of the loop only when a valid direction is given. _But_ we misspelled `promptDirection`, which will result in an “undefined variable” error. Because the `catch` block completely ignores its exception value (`e`), assuming it knows what the problem is, it wrongly treats the variable error as indicating bad input. Not only does this cause an infinite loop, but it also “buries” the useful error message about the misspelled variable. ++The `for (;;)` construct is a way to intentionally create a loop that doesn't terminate on its own. We break out of the loop only when a valid direction is given. _But_ we misspelled `promptDirection`, which will result in an “undefined variable” error. Because the `catch` block completely ignores its exception value (`e`), assuming it knows what the problem is, it wrongly treats the binding error as indicating bad input. Not only does this cause an infinite loop, it also “buries” the useful error message about the misspelled binding. 
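One way to see what such a blanket `catch` is hiding is to look at the exception value instead of ignoring it. The sketch below uses a hypothetical helper, `exceptionKind`, which is not part of the chapter's code, to report what kind of exception the misspelled call actually raises.

```javascript
// Hypothetical helper: runs `action` and reports which kind of
// exception, if any, it raised.
function exceptionKind(action) {
  try {
    action();
    return null;
  } catch (e) {
    return e instanceof Error ? e.name : typeof e;
  }
}

// promtDirection (misspelled) is not defined anywhere, so calling it
// raises a ReferenceError before any input is ever requested.
console.log(exceptionKind(() => promtDirection("Where?")));
// → ReferenceError
```

The failure has nothing to do with user input, which is exactly the information the loop above throws away.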
+ + As a general rule, don't blanket-catch exceptions unless it is for the purpose of “routing” them somewhere—for example, over the network to tell another system that our program crashed. And even then, think carefully about how you might be hiding information. + +-So we want to catch a _specific_ kind of exception. We can do this by checking in the `catch` block whether the exception we got is the one we are interested in and by rethrowing it otherwise. But how do we recognize an exception? ++So we want to catch a _specific_ kind of exception. We can do this by checking in the `catch` block whether the exception we got is the one we are interested in and rethrowing it otherwise. But how do we recognize an exception? + +-Of course, we could match its `message` property against the error message we happen to expect. But that's a shaky way to write code—we'd be using information that's intended for human consumption (the message) to make a programmatic decision. As soon as someone changes (or translates) the message, the code will stop working. ++We could compare its `message` property against the error message we happen to expect. But that's a shaky way to write code—we'd be using information that's intended for human consumption (the message) to make a programmatic decision. As soon as someone changes (or translates) the message, the code will stop working. + + Rather, let's define a new type of error and use `instanceof` to identify it. + + ``` +-function InputError(message) { +- this.message = message; +- this.stack = (new Error()).stack; +-} +-InputError.prototype = Object.create(Error.prototype); +-InputError.prototype.name = "InputError"; +-``` +- +-The prototype is made to derive from `Error.prototype` so that `instanceof Error` will also return true for `InputError` objects. It's also given a `name` property since the standard error types (`Error`, `SyntaxError`, `ReferenceError`, and so on) also have such a property. 
+- +-The assignment to the `stack` property tries to give this object a somewhat useful stack trace, on platforms that support it, by creating a regular error object and then using that object's `stack` property as its own. +- +-Now `promptDirection` can throw such an error. ++class InputError extends Error {} + +-``` + function promptDirection(question) { +- var result = prompt(question, ""); ++ let result = prompt(question); + if (result.toLowerCase() == "left") return "L"; + if (result.toLowerCase() == "right") return "R"; + throw new InputError("Invalid direction: " + result); + } + ``` + +-And the loop can catch it more carefully. ++The new error class extends `Error`. It doesn't define its own constructor, which means that it inherits the `Error` constructor, which expects a string message as argument. In fact, it doesn't define anything at all—the class is empty. `InputError` objects behave like `Error` objects, except that they have a different class by which we can recognize them. ++ ++Now the loop can catch these more carefully. + + ``` + for (;;) { + try { +- var dir = promptDirection("Where?"); ++ let dir = promptDirection("Where?"); + console.log("You chose ", dir); + break; + } catch (e) { +- if (e instanceof InputError) ++ if (e instanceof InputError) { + console.log("Not a valid direction. Try again."); +- else ++ } else { + throw e; ++ } + } + } + ``` + +-This will catch only instances of `InputError` and let unrelated exceptions through. If you reintroduce the typo, the undefined variable error will be properly reported. ++This will catch only instances of `InputError` and let unrelated exceptions through. If you reintroduce the typo, the undefined binding error will be properly reported. + + ## Assertions + +-_Assertions_ are a tool to do basic sanity checking for programmer errors. Consider this helper function, `assert`: ++_Assertions_ are checks inside a program that verify that something is the way it is supposed to be. 
They are used not to handle situations that can come up in normal operation, but to find programmer mistakes. + +-``` +-function AssertionFailed(message) { +- this.message = message; +-} +-AssertionFailed.prototype = Object.create(Error.prototype); ++If, for example, `firstElement` is described as a function that should never be called on empty arrays, we might write it like this: + +-function assert(test, message) { +- if (!test) +- throw new AssertionFailed(message); +-} +- +-function lastElement(array) { +- assert(array.length > 0, "empty array in lastElement"); +- return array[array.length - 1]; ++``` ++function firstElement(array) { ++ if (array.length == 0) { ++ throw new Error("firstElement called with []"); ++ } ++ return array[0]; + } + ``` + +-This provides a compact way to enforce expectations, helpfully blowing up the program if the stated condition does not hold. For instance, the `lastElement` function, which fetches the last element from an array, would return `undefined` on empty arrays if the assertion was omitted. Fetching the last element from an empty array does not make much sense, so it is almost certainly a programmer error to do so. ++Now, instead of silently returning undefined (which you get when reading an array property that does not exist), this will loudly blow up your program as soon as you misuse it. This makes it less likely for such mistakes to go unnoticed, and easier to find their cause when they occur. + +-Assertions are a way to make sure mistakes cause failures at the point of the mistake, rather than silently producing nonsense values that may go on to cause trouble in an unrelated part of the system. ++I do not recommend trying to write assertions for every possible kind of bad input. That'd be a lot of work and would lead to very noisy code. You'll want to reserve them for mistakes that are easy to make (or that you find yourself making). + + ## Summary + +-Mistakes and bad input are facts of life. 
Bugs in programs need to be found and fixed. They can become easier to notice by having automated test suites and adding assertions to your programs. ++Mistakes and bad input are facts of life. An important part of programming is finding, diagnosing, and fixing bugs. Problems can become easier to notice if you have an automated test suite or add assertions to your programs. + +-Problems caused by factors outside the program's control should usually be handled gracefully. Sometimes, when the problem can be handled locally, special return values are a sane way to track them. Otherwise, exceptions are preferable. ++Problems caused by factors outside the program's control should usually be handled gracefully. Sometimes, when the problem can be handled locally, special return values are a good way to track them. Otherwise, exceptions may be preferable. + +-Throwing an exception causes the call stack to be unwound until the next enclosing `try/catch` block or until the bottom of the stack. The exception value will be given to the `catch` block that catches it, which should verify that it is actually the expected kind of exception and then do something with it. To deal with the unpredictable control flow caused by exceptions, `finally` blocks can be used to ensure a piece of code is _always_ run when a block finishes. ++Throwing an exception causes the call stack to be unwound until the next enclosing `try/catch` block or until the bottom of the stack. The exception value will be given to the `catch` block that catches it, which should verify that it is actually the expected kind of exception and then do something with it. To help address the unpredictable control flow caused by exceptions, `finally` blocks can be used to ensure that a piece of code _always_ runs when a block finishes. 
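The interplay summarized above can be sketched in a few lines. This is only an illustration: `risky`, `handle`, and the `log` array are hypothetical names, and `InputError` is the same empty subclass of `Error` defined earlier in the chapter.

```javascript
class InputError extends Error {}

const log = [];

function risky(kind) {
  try {
    log.push("start " + kind);
    if (kind == "input") throw new InputError("bad input");
    if (kind == "bug") throw new TypeError("programmer mistake");
    return "fine";
  } finally {
    // Runs on a normal return and also while an exception unwinds.
    log.push("cleanup " + kind);
  }
}

function handle(kind) {
  try {
    return risky(kind);
  } catch (e) {
    // Handle only the expected kind of exception.
    if (e instanceof InputError) return "recovered";
    throw e; // anything else keeps unwinding the stack
  }
}

console.log(handle("ok"));
// → fine
console.log(handle("input"));
// → recovered
```

Calling `handle("bug")` would still run the cleanup in `finally`, but the `TypeError` is rethrown and continues toward the bottom of the stack.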
+ + ## Exercises + + ### Retry + +-Say you have a function `primitiveMultiply` that, in 50 percent of cases, multiplies two numbers, and in the other 50 percent, raises an exception of type `MultiplicatorUnitFailure`. Write a function that wraps this clunky function and just keeps trying until a call succeeds, after which it returns the result. ++Say you have a function `primitiveMultiply` that, in 20 percent of cases, multiplies two numbers, and in the other 80 percent, raises an exception of type `MultiplicatorUnitFailure`. Write a function that wraps this clunky function and just keeps trying until a call succeeds, after which it returns the result. + + Make sure you handle only the exceptions you are trying to handle. + + ``` +-function MultiplicatorUnitFailure() {} ++class MultiplicatorUnitFailure extends Error {} + + function primitiveMultiply(a, b) { +- if (Math.random() < 0.5) ++ if (Math.random() < 0.2) { + return a * b; +- else +- throw new MultiplicatorUnitFailure(); ++ } else { ++ throw new MultiplicatorUnitFailure("Klunk"); ++ } + } + + function reliableMultiply(a, b) { +@@ -412,7 +430,7 @@ console.log(reliableMultiply(8, 8)); + // → 64 + ``` + +-The call to `primitiveMultiply` should obviously happen in a `try` block. The corresponding `catch` block should rethrow the exception when it is not an instance of `MultiplicatorUnitFailure` and ensure the call is retried when it is. ++The call to `primitiveMultiply` should definitely happen in a `try` block. The corresponding `catch` block should rethrow the exception when it is not an instance of `MultiplicatorUnitFailure` and ensure the call is retried when it is. + + To do the retrying, you can either use a loop that breaks only when a call succeeds—as in the [`look` example](08_error.html#look) earlier in this chapter—or use recursion and hope you don't get a string of failures so long that it overflows the stack (which is a pretty safe bet). 
+ +@@ -421,10 +439,10 @@ To do the retrying, you can either use a loop that breaks only when a call succe + Consider the following (rather contrived) object: + + ``` +-var box = { ++const box = { + locked: true, +- unlock: function() { this.locked = false; }, +- lock: function() { this.locked = true; }, ++ unlock() { this.locked = false; }, ++ lock() { this.locked = true; }, + _content: [], + get content() { + if (this.locked) throw new Error("Locked!"); +@@ -433,11 +451,22 @@ var box = { + }; + ``` + +-It is a box with a lock. Inside is an array, but you can get at it only when the box is unlocked. Directly accessing the `_content` property is not allowed. ++It is a box with a lock. There is an array in the box, but you can get at it only when the box is unlocked. Directly accessing the private `_content` property is forbidden. + + Write a function called `withBoxUnlocked` that takes a function value as argument, unlocks the box, runs the function, and then ensures that the box is locked again before returning, regardless of whether the argument function returned normally or threw an exception. + + ``` ++const box = { ++ locked: true, ++ unlock() { this.locked = false; }, ++ lock() { this.locked = true; }, ++ _content: [], ++ get content() { ++ if (this.locked) throw new Error("Locked!"); ++ return this._content; ++ } ++}; ++ + function withBoxUnlocked(body) { + // Your code here. + } +@@ -459,6 +488,6 @@ console.log(box.locked); + + For extra points, make sure that if you call `withBoxUnlocked` when the box is already unlocked, the box stays unlocked. + +-This exercise calls for a `finally` block, as you probably guessed. Your function should first unlock the box and then call the argument function from inside a `try` body. The `finally` block after it should lock the box again. ++This exercise calls for a `finally` block. Your function should first unlock the box and then call the argument function from inside a `try` body. 
The `finally` block after it should lock the box again. + + To make sure we don't lock the box when it wasn't already locked, check its lock at the start of the function and unlock and lock it only when it started out locked. diff --git a/diff-en/2ech9-3ech9.diff b/diff-en/2ech9-3ech9.diff new file mode 100644 index 0000000..087b1d6 --- /dev/null +++ b/diff-en/2ech9-3ech9.diff @@ -0,0 +1,787 @@ +diff --git a/2ech9.md b/3ech9.md +index cbbeb4e..7a67f2e 100644 +--- a/2ech9.md ++++ b/3ech9.md +@@ -4,37 +4,35 @@ + > + > <footer>Jamie Zawinski</footer> + +-> Yuan-Ma said, ‘When you cut against the grain of the wood, much strength is needed. When you program against the grain of a problem, much code is needed.' ++> Yuan-Ma said, ‘When you cut against the grain of the wood, much strength is needed. When you program against the grain of the problem, much code is needed.' + > + > <footer>Master Yuan-Ma, <cite>The Book of Programming</cite></footer> + +-Programming tools and techniques survive and spread in a chaotic, evolutionary way. It's not always the pretty or brilliant ones that win but rather the ones that function well enough within the right niche—for example, by being integrated with another successful piece of technology. ++Programming tools and techniques survive and spread in a chaotic, evolutionary way. It's not always the pretty or brilliant ones that win but rather the ones that function well enough within the right niche or happen to be integrated with another successful piece of technology. + +-In this chapter, I will discuss one such tool, _regular expressions_. Regular expressions are a way to describe patterns in string data. They form a small, separate language that is part of JavaScript and many other languages and tools. ++In this chapter, I will discuss one such tool, _regular expressions_. Regular expressions are a way to describe patterns in string data. 
They form a small, separate language that is part of JavaScript and many other languages and systems. + + Regular expressions are both terribly awkward and extremely useful. Their syntax is cryptic, and the programming interface JavaScript provides for them is clumsy. But they are a powerful tool for inspecting and processing strings. Properly understanding regular expressions will make you a more effective programmer. + + ## Creating a regular expression + +-A regular expression is a type of object. It can either be constructed with the `RegExp` constructor or written as a literal value by enclosing the pattern in forward slash (`/`) characters. ++A regular expression is a type of object. It can either be constructed with the `RegExp` constructor or written as a literal value by enclosing a pattern in forward slash (`/`) characters. + + ``` +-var re1 = new RegExp("abc"); +-var re2 = /abc/; ++let re1 = new RegExp("abc"); ++let re2 = /abc/; + ``` + +-Both of these regular expression objects represent the same pattern: an _a_ character followed by a _b_ followed by a _c_. ++Both of those regular expression objects represent the same pattern: an _a_ character followed by a _b_ followed by a _c_. + + When using the `RegExp` constructor, the pattern is written as a normal string, so the usual rules apply for backslashes. + + The second notation, where the pattern appears between slash characters, treats backslashes somewhat differently. First, since a forward slash ends the pattern, we need to put a backslash before any forward slash that we want to be _part_ of the pattern. In addition, backslashes that aren't part of special character codes (like `\n`) will be _preserved_, rather than ignored as they are in strings, and change the meaning of the pattern. Some characters, such as question marks and plus signs, have special meanings in regular expressions and must be preceded by a backslash if they are meant to represent the character itself. 
+ + ``` +-var eighteenPlus = /eighteen\+/; ++let eighteenPlus = /eighteen\+/; + ``` + +-Knowing precisely what characters to backslash-escape when writing regular expressions requires you to know every character with a special meaning. For the time being, this may not be realistic, so when in doubt, just put a backslash before any character that is not a letter, number, or whitespace. +- + ## Testing for matches + + Regular expression objects have a number of methods. The simplest one is `test`. If you pass it a string, it will return a Boolean telling you whether the string contains a match of the pattern in the expression. +@@ -48,9 +46,9 @@ console.log(/abc/.test("abxde")); + + A regular expression consisting of only nonspecial characters simply represents that sequence of characters. If _abc_ occurs anywhere in the string we are testing against (not just at the start), `test` will return `true`. + +-## Matching a set of characters ++## Sets of characters + +-Finding out whether a string contains _abc_ could just as well be done with a call to `indexOf`. Regular expressions allow us to go beyond that and express more complicated patterns. ++Finding out whether a string contains _abc_ could just as well be done with a call to `indexOf`. Regular expressions allow us to express more complicated patterns. + + Say we want to match any number. In a regular expression, putting a set of characters between square brackets makes that part of the expression match any of the characters between the brackets. + +@@ -65,7 +63,7 @@ console.log(/[0-9]/.test("in 1992")); + + Within square brackets, a dash (`-`) between two characters can be used to indicate a range of characters, where the ordering is determined by the character's Unicode number. Characters 0 to 9 sit right next to each other in this ordering (codes 48 to 57), so `[0-9]` covers all of them and matches any digit. + +-There are a number of common character groups that have their own built-in shortcuts. 
Digits are one of them: `\d` means the same thing as `[0-9]`. ++A number of common character groups have their own built-in shortcuts. Digits are one of them: `\d` means the same thing as `[0-9]`. + + | `\d` | Any digit character | + | `\w` | An alphanumeric character (“word character”) | +@@ -78,21 +76,21 @@ There are a number of common character groups that have their own built-in short + So you could match a date and time format like 30-01-2003 15:20 with the following expression: + + ``` +-var dateTime = /\d\d-\d\d-\d\d\d\d \d\d:\d\d/; ++let dateTime = /\d\d-\d\d-\d\d\d\d \d\d:\d\d/; + console.log(dateTime.test("30-01-2003 15:20")); + // → true + console.log(dateTime.test("30-jan-2003 15:20")); + // → false + ``` + +-That looks completely awful, doesn't it? It has way too many backslashes, producing background noise that makes it hard to spot the actual pattern expressed. We'll see a slightly improved version of this expression [later](09_regexp.html#date_regexp_counted). ++That looks completely awful, doesn't it? Half of it is backslashes, producing a background noise that makes it hard to spot the actual pattern expressed. We'll see a slightly improved version of this expression [later](09_regexp.html#date_regexp_counted). + +-These backslash codes can also be used inside square brackets. For example, `[\d.]` means any digit or a period character. But note that the period itself, when used between square brackets, loses its special meaning. The same goes for other special characters, such as `+`. ++These backslash codes can also be used inside square brackets. For example, `[\d.]` means any digit or a period character. But the period itself, between square brackets, loses its special meaning. The same goes for other special characters, such as `+`. + + To _invert_ a set of characters—that is, to express that you want to match any character _except_ the ones in the set—you can write a caret (`^`) character after the opening bracket. 
+ + ``` +-var notBinary = /[^01]/; ++let notBinary = /[^01]/; + console.log(notBinary.test("1100100010100110")); + // → false + console.log(notBinary.test("1100100010200110")); +@@ -118,10 +116,10 @@ console.log(/'\d*'/.test("''")); + + The star (`*`) has a similar meaning but also allows the pattern to match zero times. Something with a star after it never prevents a pattern from matching—it'll just match zero instances if it can't find any suitable text to match. + +-A question mark makes a part of a pattern “optional”, meaning it may occur zero or one time. In the following example, the _u_ character is allowed to occur, but the pattern also matches when it is missing. ++A question mark makes a part of a pattern _optional_, meaning it may occur zero times or one time. In the following example, the _u_ character is allowed to occur, but the pattern also matches when it is missing. + + ``` +-var neighbor = /neighbou?r/; ++let neighbor = /neighbou?r/; + console.log(neighbor.test("neighbour")); + // → true + console.log(neighbor.test("neighbor")); +@@ -130,36 +128,36 @@ console.log(neighbor.test("neighbor")); + + To indicate that a pattern should occur a precise number of times, use curly braces. Putting `{4}` after an element, for example, requires it to occur exactly four times. It is also possible to specify a range this way: `{2,4}` means the element must occur at least twice and at most four times. + +-Here is another version of the date and time pattern that allows both single- and double-digit days, months, and hours. It is also slightly more readable. ++Here is another version of the date and time pattern that allows both single- and double-digit days, months, and hours. It is also slightly easier to decipher. 
+ + ``` +-var dateTime = /\d{1,2}-\d{1,2}-\d{4} \d{1,2}:\d{2}/; ++let dateTime = /\d{1,2}-\d{1,2}-\d{4} \d{1,2}:\d{2}/; + console.log(dateTime.test("30-1-2003 8:45")); + // → true + ``` + +-You can also specify open-ended ranges when using curly braces by omitting the number after the comma. So `{5,}` means five or more times. ++You can also specify open-ended ranges when using curly braces by omitting the number after the comma. So, `{5,}` means five or more times. + + ## Grouping subexpressions + +-To use an operator like `*` or `+` on more than one element at a time, you can use parentheses. A part of a regular expression that is enclosed in parentheses counts as a single element as far as the operators following it are concerned. ++To use an operator like `*` or `+` on more than one element at a time, you have to use parentheses. A part of a regular expression that is enclosed in parentheses counts as a single element as far as the operators following it are concerned. + + ``` +-var cartoonCrying = /boo+(hoo+)+/i; ++let cartoonCrying = /boo+(hoo+)+/i; + console.log(cartoonCrying.test("Boohoooohoohooo")); + // → true + ``` + + The first and second `+` characters apply only to the second _o_ in _boo_ and _hoo_, respectively. The third `+` applies to the whole group `(hoo+)`, matching one or more sequences like that. + +-The `i` at the end of the expression in the previous example makes this regular expression case insensitive, allowing it to match the uppercase _B_ in the input string, even though the pattern is itself all lowercase. ++The `i` at the end of the expression in the example makes this regular expression case insensitive, allowing it to match the uppercase _B_ in the input string, even though the pattern is itself all lowercase. + + ## Matches and groups + + The `test` method is the absolute simplest way to match a regular expression. It tells you only whether it matched and nothing else. 
Regular expressions also have an `exec` (execute) method that will return `null` if no match was found and return an object with information about the match otherwise. + + ``` +-var match = /\d+/.exec("one two 100"); ++let match = /\d+/.exec("one two 100"); + console.log(match); + // → ["100"] + console.log(match.index); +@@ -178,7 +176,7 @@ console.log("one two 100".match(/\d+/)); + When the regular expression contains subexpressions grouped with parentheses, the text that matched those groups will also show up in the array. The whole match is always the first element. The next element is the part matched by the first group (the one whose opening parenthesis comes first in the expression), then the second group, and so on. + + ``` +-var quotedText = /'([^']*)'/; ++let quotedText = /'([^']*)'/; + console.log(quotedText.exec("she said 'hello'")); + // → ["'hello'", "hello"] + ``` +@@ -194,15 +192,15 @@ console.log(/(\d)+/.exec("123")); + + Groups can be useful for extracting parts of a string. If we don't just want to verify whether a string contains a date but also extract it and construct an object that represents it, we can wrap parentheses around the digit patterns and directly pick the date out of the result of `exec`. + +-But first, a brief detour, in which we discuss the preferred way to store date and time values in JavaScript. ++But first, a brief detour, in which we discuss the built-in way to represent date and time values in JavaScript. + +-## The date type ++## The Date class + +-JavaScript has a standard object type for representing dates—or rather, points in time. It is called `Date`. If you simply create a date object using `new`, you get the current date and time. ++JavaScript has a standard class for representing dates—or rather, points in time. It is called `Date`. If you simply create a date object using `new`, you get the current date and time. 
+ + ``` + console.log(new Date()); +-// → Wed Dec 04 2013 14:24:57 GMT+0100 (CET) ++// → Mon Nov 13 2017 16:19:11 GMT+0100 (CET) + ``` + + You can also create an object for a specific time. +@@ -218,7 +216,7 @@ JavaScript uses a convention where month numbers start at zero (so December is 1 + + The last four arguments (hours, minutes, seconds, and milliseconds) are optional and taken to be zero when not given. + +-Timestamps are stored as the number of milliseconds since the start of 1970, using negative numbers for times before 1970 (following a convention set by “Unix time”, which was invented around that time). The `getTime` method on a date object returns this number. It is big, as you can imagine. ++Timestamps are stored as the number of milliseconds since the start of 1970, in the UTC time zone. This follows a convention set by “Unix time”, which was invented around that time. You can use negative numbers for times before 1970\. The `getTime` method on a date object returns this number. It is big, as you can imagine. + + ``` + console.log(new Date(2013, 11, 19).getTime()); +@@ -227,29 +225,29 @@ console.log(new Date(1387407600000)); + // → Thu Dec 19 2013 00:00:00 GMT+0100 (CET) + ``` + +-If you give the `Date` constructor a single argument, that argument is treated as such a millisecond count. You can get the current millisecond count by creating a new `Date` object and calling `getTime` on it but also by calling the `Date.now` function. ++If you give the `Date` constructor a single argument, that argument is treated as such a millisecond count. You can get the current millisecond count by creating a new `Date` object and calling `getTime` on it or by calling the `Date.now` function. + +-Date objects provide methods like `getFullYear`, `getMonth`, `getDate`, `getHours`, `getMinutes`, and `getSeconds` to extract their components. There's also `getYear`, which gives you a rather useless two-digit year value (such as `93` or `14`). 
++Date objects provide methods like `getFullYear`, `getMonth`, `getDate`, `getHours`, `getMinutes`, and `getSeconds` to extract their components. Besides `getFullYear`, there's also `getYear`, which gives you a rather useless two-digit year value (such as `93` or `14`). + +-Putting parentheses around the parts of the expression that we are interested in, we can now easily create a date object from a string. ++Putting parentheses around the parts of the expression that we are interested in, we can now create a date object from a string. + + ``` +-function findDate(string) { +- var dateTime = /(\d{1,2})-(\d{1,2})-(\d{4})/; +- var match = dateTime.exec(string); +- return new Date(Number(match[3]), +- Number(match[2]) - 1, +- Number(match[1])); ++function getDate(string) { ++ let [_, day, month, year] = ++ /(\d{1,2})-(\d{1,2})-(\d{4})/.exec(string); ++ return new Date(year, month - 1, day); + } +-console.log(findDate("30-1-2003")); ++console.log(getDate("30-1-2003")); + // → Thu Jan 30 2003 00:00:00 GMT+0100 (CET) + ``` + ++The `_` (underscore) binding is ignored, and only used to skip the full match element in the array returned by `exec`. ++ + ## Word and string boundaries + +-Unfortunately, `findDate` will also happily extract the nonsensical date 00-1-3000 from the string `"100-1-30000"`. A match may happen anywhere in the string, so in this case, it'll just start at the second character and end at the second-to-last character. ++Unfortunately, `getDate` will also happily extract the nonsensical date 00-1-3000 from the string `"100-1-30000"`. A match may happen anywhere in the string, so in this case, it'll just start at the second character and end at the second-to-last character. + +-If we want to enforce that the match must span the whole string, we can add the markers `^` and `![Visualization of /\b\d+ (pig|cow|chicken)s?\b/](img/re_pigchickens.svg) + +-Our expression matches a string if we can find a path from the left side of the diagram to the right side. 
We keep a current position in the string, and every time we move through a box, we verify that the part of the string after our current position matches that box. ++Our expression matches if we can find a path from the left side of the diagram to the right side. We keep a current position in the string, and every time we move through a box, we verify that the part of the string after our current position matches that box. + +-So if we try to match `"the 3 pigs"` with our regular expression, our progress through the flow chart would look like this: ++So if we try to match `"the 3 pigs"` from position 4, our progress through the flow chart would look like this: + + * At position 4, there is a word boundary, so we can move past the first box. + +@@ -300,17 +300,15 @@ So if we try to match `"the 3 pigs"` with our regular expression, our progress t + + * We're at position 10 (the end of the string) and can match only a word boundary. The end of a string counts as a word boundary, so we go through the last box and have successfully matched this string. + +-Conceptually, a regular expression engine looks for a match in a string as follows: it starts at the start of the string and tries a match there. In this case, there _is_ a word boundary there, so it'd get past the first box—but there is no digit, so it'd fail at the second box. Then it moves on to the second character in the string and tries to begin a new match there... and so on, until it finds a match or reaches the end of the string and decides that there really is no match. +- + ## Backtracking + +-The regular expression `/\b([01]+b|\d+|[\da-f]+h)\b/` matches either a binary number followed by a _b_, a regular decimal number with no suffix character, or a hexadecimal number (that is, base 16, with the letters _a_ to _f_ standing for the digits 10 to 15) followed by an _h_. 
This is the corresponding diagram: ++The regular expression `/\b([01]+b|[\da-f]+h|\d+)\b/` matches either a binary number followed by a _b_, a hexadecimal number (that is, base 16, with the letters _a_ to _f_ standing for the digits 10 to 15) followed by an _h_, or a regular decimal number with no suffix character. This is the corresponding diagram: + +-![Visualization of /\b([01]+b|\d+|[\da-f]+h)\b/](img/re_number.svg) ++
![Visualization of /\b([01]+b|\d+|[\da-f]+h)\b/](img/re_number.svg)
+ + When matching this expression, it will often happen that the top (binary) branch is entered even though the input does not actually contain a binary number. When matching the string `"103"`, for example, it becomes clear only at the 3 that we are in the wrong branch. The string _does_ match the expression, just not the branch we are currently in. + +-So the matcher _backtracks_. When entering a branch, it remembers its current position (in this case, at the start of the string, just past the first boundary box in the diagram) so that it can go back and try another branch if the current one does not work out. For the string `"103"`, after encountering the 3 character, it will start trying the branch for decimal numbers. This one matches, so a match is reported after all. ++So the matcher _backtracks_. When entering a branch, it remembers its current position (in this case, at the start of the string, just past the first boundary box in the diagram) so that it can go back and try another branch if the current one does not work out. For the string `"103"`, after encountering the 3 character, it will start trying the branch for hexadecimal numbers, which fails again because there is no _h_ after the number. So it tries the decimal number branch. This one fits, and a match is reported after all. + + The matcher stops as soon as it finds a full match. This means that if multiple branches could potentially match a string, only the first one (ordered by where the branches appear in the regular expression) is used. + +@@ -318,13 +316,13 @@ Backtracking also happens for repetition operators like + and `*`. If you match + + It is possible to write regular expressions that will do a _lot_ of backtracking. This problem occurs when a pattern can match a piece of input in many different ways. For example, if we get confused while writing a binary-number regular expression, we might accidentally write something like `/([01]+)+b/`. 
+ +-![Visualization of /([01]+)+b/](img/re_slow.svg) ++
![Visualization of /([01]+)+b/](img/re_slow.svg)
+ +-If that tries to match some long series of zeros and ones with no trailing _b_ character, the matcher will first go through the inner loop until it runs out of digits. Then it notices there is no _b_, so it backtracks one position, goes through the outer loop once, and gives up again, trying to backtrack out of the inner loop once more. It will continue to try every possible route through these two loops. This means the amount of work _doubles_ with each additional character. For even just a few dozen characters, the resulting match will take practically forever. ++If that tries to match some long series of zeros and ones with no trailing _b_ character, the matcher first goes through the inner loop until it runs out of digits. Then it notices there is no _b_, so it backtracks one position, goes through the outer loop once, and gives up again, trying to backtrack out of the inner loop once more. It will continue to try every possible route through these two loops. This means the amount of work _doubles_ with each additional character. For even just a few dozen characters, the resulting match will take practically forever. + + ## The replace method + +-String values have a `replace` method, which can be used to replace part of the string with another string. ++String values have a `replace` method that can be used to replace part of the string with another string. + + ``` + console.log("papa".replace("p", "m")); +@@ -342,41 +340,41 @@ console.log("Borobudur".replace(/[ou]/g, "a")); + + It would have been sensible if the choice between replacing one match or all matches was made through an additional argument to `replace` or by providing a different method, `replaceAll`. But for some unfortunate reason, the choice relies on a property of the regular expression instead. + +-The real power of using regular expressions with `replace` comes from the fact that we can refer back to matched groups in the replacement string. 
For example, say we have a big string containing the names of people, one name per line, in the format `Lastname, Firstname`. If we want to swap these names and remove the comma to get a simple `Firstname Lastname` format, we can use the following code: ++The real power of using regular expressions with `replace` comes from the fact that we can refer back to matched groups in the replacement string. For example, say we have a big string containing the names of people, one name per line, in the format `Lastname, Firstname`. If we want to swap these names and remove the comma to get a `Firstname Lastname` format, we can use the following code: + + ``` + console.log( +- "Hopper, Grace\nMcCarthy, John\nRitchie, Dennis" +- .replace(/([\w ]+), ([\w ]+)/g, "$2 $1")); +-// → Grace Hopper ++ "Liskov, Barbara\nMcCarthy, John\nWadler, Philip" ++ .replace(/(\w+), (\w+)/g, "$2 $1")); ++// → Barbara Liskov + // John McCarthy +-// Dennis Ritchie ++// Philip Wadler + ``` + + The `$1` and `$2` in the replacement string refer to the parenthesized groups in the pattern. `$1` is replaced by the text that matched against the first group, `$2` by the second, and so on, up to `$9`. The whole match can be referred to with `$&`. + +-It is also possible to pass a function, rather than a string, as the second argument to `replace`. For each replacement, the function will be called with the matched groups (as well as the whole match) as arguments, and its return value will be inserted into the new string. ++It is possible to pass a function—rather than a string—as the second argument to `replace`. For each replacement, the function will be called with the matched groups (as well as the whole match) as arguments, and its return value will be inserted into the new string. 
+ +-Here's a simple example: ++Here's a small example: + + ``` +-var s = "the cia and fbi"; +-console.log(s.replace(/\b(fbi|cia)\b/g, function(str) { +- return str.toUpperCase(); +-})); ++let s = "the cia and fbi"; ++console.log(s.replace(/\b(fbi|cia)\b/g, ++ str => str.toUpperCase())); + // → the CIA and FBI + ``` + + And here's a more interesting one: + + ``` +-var stock = "1 lemon, 2 cabbages, and 101 eggs"; ++let stock = "1 lemon, 2 cabbages, and 101 eggs"; + function minusOne(match, amount, unit) { + amount = Number(amount) - 1; +- if (amount == 1) // only one left, remove the 's' ++ if (amount == 1) { // only one left, remove the 's' + unit = unit.slice(0, unit.length - 1); +- else if (amount == 0) ++ } else if (amount == 0) { + amount = "no"; ++ } + return amount + " " + unit; + } + console.log(stock.replace(/(\d+) (\w+)/g, minusOne)); +@@ -389,7 +387,7 @@ The `(\d+)` group ends up as the `amount` argument to the function, and the `(\w + + ## Greed + +-It isn't hard to use `replace` to write a function that removes all comments from a piece of JavaScript code. Here is a first attempt: ++It is possible to use `replace` to write a function that removes all comments from a piece of JavaScript code. Here is a first attempt: + + ``` + function stripComments(code) { +@@ -403,9 +401,9 @@ console.log(stripComments("1 /* a */+/* b */ 1")); + // → 1 1 + ``` + +-The part before the _or_ operator simply matches two slash characters followed by any number of non-newline characters. The part for multiline comments is more involved. We use `[^]` (any character that is not in the empty set of characters) as a way to match any character. We cannot just use a dot here because block comments can continue on a new line, and dots do not match the newline character. ++The part before the _or_ operator matches two slash characters followed by any number of non-newline characters. The part for multiline comments is more involved. 
We use `[^]` (any character that is not in the empty set of characters) as a way to match any character. We cannot just use a period here because block comments can continue on a new line, and the period character does not match newline characters. + +-But the output of the previous example appears to have gone wrong. Why? ++But the output for the last line appears to have gone wrong. Why? + + The `[^]*` part of the expression, as I described in the section on backtracking, will first match as much as it can. If that causes the next part of the pattern to fail, the matcher moves back one character and tries again from there. In the example, the matcher first tries to match the whole rest of the string and then moves back from there. It will find an occurrence of `*/` after going back four characters and match that. This is not what we wanted—the intention was to match a single comment, not to go all the way to the end of the code and find the end of the last block comment. + +@@ -430,31 +428,31 @@ There are cases where you might not know the exact pattern you need to match aga + But you can build up a string and use the `RegExp` constructor on that. Here's an example: + + ``` +-var name = "harry"; +-var text = "Harry is a suspicious character."; +-var regexp = new RegExp("\\b(" + name + ")\\b", "gi"); ++let name = "harry"; ++let text = "Harry is a suspicious character."; ++let regexp = new RegExp("\\b(" + name + ")\\b", "gi"); + console.log(text.replace(regexp, "_$1_")); + // → _Harry_ is a suspicious character. + ``` + +-When creating the `\b` boundary markers, we have to use two backslashes because we are writing them in a normal string, not a slash-enclosed regular expression. The second argument to the `RegExp` constructor contains the options for the regular expression—in this case `"gi"` for global and case-insensitive. 
++When creating the `\b` boundary markers, we have to use two backslashes because we are writing them in a normal string, not a slash-enclosed regular expression. The second argument to the `RegExp` constructor contains the options for the regular expression—in this case, `"gi"` for global and case-insensitive. + +-But what if the name is `"dea+hl[]rd"` because our user is a nerdy teenager? That would result in a nonsensical regular expression, which won't actually match the user's name. ++But what if the name is `"dea+hl[]rd"` because our user is a nerdy teenager? That would result in a nonsensical regular expression that won't actually match the user's name. + +-To work around this, we can add backslashes before any character that we don't trust. Adding backslashes before alphabetic characters is a bad idea because things like `\b` and `\n` have a special meaning. But escaping everything that's not alphanumeric or whitespace is safe. ++To work around this, we can add backslashes before any character that has a special meaning. + + ``` +-var name = "dea+hl[]rd"; +-var text = "This dea+hl[]rd guy is super annoying."; +-var escaped = name.replace(/[^\w\s]/g, "\\$&"); +-var regexp = new RegExp("\\b(" + escaped + ")\\b", "gi"); +-console.log(text.replace(regexp, "_$1_")); ++let name = "dea+hl[]rd"; ++let text = "This dea+hl[]rd guy is super annoying."; ++let escaped = name.replace(/[\\[.+*?(){|^$]/g, "\\$&"); ++let regexp = new RegExp("\\b" + escaped + "\\b", "gi"); ++console.log(text.replace(regexp, "_$&_")); + // → This _dea+hl[]rd_ guy is super annoying. + ``` + + ## The search method + +-The `indexOf` method on strings cannot be called with a regular expression. But there is another method, `search`, which does expect a regular expression. Like `indexOf`, it returns the first index on which the expression was found, or -1 when it wasn't found. ++The `indexOf` method on strings cannot be called with a regular expression. 
But there is another method, `search`, that does expect a regular expression. Like `indexOf`, it returns the first index on which the expression was found, or -1 when it wasn't found. + + ``` + console.log(" word".search(/\S/)); +@@ -471,12 +469,12 @@ The `exec` method similarly does not provide a convenient way to start searching + + Regular expression objects have properties. One such property is `source`, which contains the string that expression was created from. Another property is `lastIndex`, which controls, in some limited circumstances, where the next match will start. + +-Those circumstances are that the regular expression must have the global (`g`) option enabled, and the match must happen through the `exec` method. Again, a more sane solution would have been to just allow an extra argument to be passed to `exec`, but sanity is not a defining characteristic of JavaScript's regular expression interface. ++Those circumstances are that the regular expression must have the global (`g`) or sticky (`y`) option enabled, and the match must happen through the `exec` method. Again, a less confusing solution would have been to just allow an extra argument to be passed to `exec`, but confusion is an essential feature of JavaScript's regular expression interface. + + ``` +-var pattern = /y/g; ++let pattern = /y/g; + pattern.lastIndex = 3; +-var match = pattern.exec("xyzzy"); ++let match = pattern.exec("xyzzy"); + console.log(match.index); + // → 4 + console.log(pattern.lastIndex); +@@ -485,10 +483,21 @@ console.log(pattern.lastIndex); + + If the match was successful, the call to `exec` automatically updates the `lastIndex` property to point after the match. If no match was found, `lastIndex` is set back to zero, which is also the value it has in a newly constructed regular expression object. + +-When using a global regular expression value for multiple `exec` calls, these automatic updates to the `lastIndex` property can cause problems. 
Your regular expression might be accidentally starting at an index that was left over from a previous call. ++The difference between the global and the sticky options is that, when sticky is enabled, the match will only succeed if it starts directly at `lastIndex`, whereas with global, it will search ahead for a position where a match can start. + + ``` +-var digit = /\d/g; ++let global = /abc/g; ++console.log(global.exec("xyz abc")); ++// → ["abc"] ++let sticky = /abc/y; ++console.log(sticky.exec("xyz abc")); ++// → null ++``` ++ ++When using a shared regular expression value for multiple `exec` calls, these automatic updates to the `lastIndex` property can cause problems. Your regular expression might be accidentally starting at an index that was left over from a previous call. ++ ++``` ++let digit = /\d/g; + console.log(digit.exec("here it is: 1")); + // → ["1"] + console.log(digit.exec("and now: 1")); +@@ -506,27 +515,28 @@ So be cautious with global regular expressions. The cases where they are necessa + + ### Looping over matches + +-A common pattern is to scan through all occurrences of a pattern in a string, in a way that gives us access to the match object in the loop body, by using `lastIndex` and `exec`. ++A common thing to do is to scan through all occurrences of a pattern in a string, in a way that gives us access to the match object in the loop body. We can do this by using `lastIndex` and `exec`. + + ``` +-var input = "A string with 3 numbers in it... 42 and 88."; +-var number = /\b(\d+)\b/g; +-var match; +-while (match = number.exec(input)) +- console.log("Found", match[1], "at", match.index); ++let input = "A string with 3 numbers in it... 
42 and 88."; ++let number = /\b\d+\b/g; ++let match; ++while (match = number.exec(input)) { ++ console.log("Found", match[0], "at", match.index); ++} + // → Found 3 at 14 + // Found 42 at 33 + // Found 88 at 40 + ``` + +-This makes use of the fact that the value of an assignment expression (`=`) is the assigned value. So by using `match = number.exec(input)` as the condition in the `while` statement, we perform the match at the start of each iteration, save its result in a variable, and stop looping when no more matches are found. ++This makes use of the fact that the value of an assignment expression (`=`) is the assigned value. So by using `match = number.exec(input)` as the condition in the `while` statement, we perform the match at the start of each iteration, save its result in a binding, and stop looping when no more matches are found. + + ## Parsing an INI file + +-To conclude the chapter, we'll look at a problem that calls for regular expressions. Imagine we are writing a program to automatically harvest information about our enemies from the Internet. (We will not actually write that program here, just the part that reads the configuration file. Sorry to disappoint.) The configuration file looks like this: ++To conclude the chapter, we'll look at a problem that calls for regular expressions. Imagine we are writing a program to automatically collect information about our enemies from the Internet. (We will not actually write that program here, just the part that reads the configuration file. Sorry.) The configuration file looks like this: + + ``` +-searchengine=http://www.google.com/search?q=$1 ++searchengine=https://duckduckgo.com/?q=$1 + spitefulness=9.7 + + ; comments are preceded by a semicolon... 
+@@ -536,13 +546,13 @@ fullname=Larry Doe + type=kindergarten bully + website=http://www.geocities.com/CapeCanaveral/11451 + +-[gargamel] +-fullname=Gargamel +-type=evil sorcerer +-outputdir=/home/marijn/enemies/gargamel ++[davaeorn] ++fullname=Davaeorn ++type=evil wizard ++outputdir=/home/marijn/enemies/davaeorn + ``` + +-The exact rules for this format (which is actually a widely used format, usually called an _INI_ file) are as follows: ++The exact rules for this format (which is a widely used format, usually called an _INI_ file) are as follows: + + * Blank lines and lines starting with semicolons are ignored. + +@@ -552,58 +562,84 @@ The exact rules for this format (which is actually a widely used format, usually + + * Anything else is invalid. + +-Our task is to convert a string like this into an array of objects, each with a `name` property and an array of settings. We'll need one such object for each section and one for the global settings at the top. ++Our task is to convert a string like this into an object whose properties hold strings for sectionless settings and sub-objects for sections, with those sub-objects holding the section's settings. + +-Since the format has to be processed line by line, splitting up the file into separate lines is a good start. We used `string.split("\n")` to do this in [Chapter 6](06_object.html#split). Some operating systems, however, use not just a newline character to separate lines but a carriage return character followed by a newline (`"\r\n"`). Given that the `split` method also allows a regular expression as its argument, we can split on a regular expression like `/\r?\n/` to split in a way that allows both `"\n"` and `"\r\n"` between lines. ++Since the format has to be processed line by line, splitting up the file into separate lines is a good start. We used `string.split("\n")` to do this in [Chapter 4](04_data.html#split). 
Some operating systems, however, use not just a newline character to separate lines but a carriage return character followed by a newline (`"\r\n"`). Given that the `split` method also allows a regular expression as its argument, we can use a regular expression like `/\r?\n/` to split in a way that allows both `"\n"` and `"\r\n"` between lines. + + ``` + function parseINI(string) { + // Start with an object to hold the top-level fields +- var currentSection = {name: null, fields: []}; +- var categories = [currentSection]; +- +- string.split(/\r?\n/).forEach(function(line) { +- var match; +- if (/^\s*(;.*)?$/.test(line)) { +- return; ++ let result = {}; ++ let section = result; ++ string.split(/\r?\n/).forEach(line => { ++ let match; ++ if (match = line.match(/^(\w+)=(.*)$/)) { ++ section[match[1]] = match[2]; + } else if (match = line.match(/^\[(.*)\]$/)) { +- currentSection = {name: match[1], fields: []}; +- categories.push(currentSection); +- } else if (match = line.match(/^(\w+)=(.*)$/)) { +- currentSection.fields.push({name: match[1], +- value: match[2]}); +- } else { +- throw new Error("Line '" + line + "' is invalid."); ++ section = result[match[1]] = {}; ++ } else if (!/^\s*(;.*)?$/.test(line)) { ++ throw new Error("Line '" + line + "' is not valid."); + } + }); +- +- return categories; ++ return result; + } +-``` + +-This code goes over every line in the file, updating the “current section” object as it goes along. First, it checks whether the line can be ignored, using the expression `/^\s*(;.*)?$/`. Do you see how it works? The part between the parentheses will match comments, and the `?` will make sure it also matches lines containing only whitespace. +- +-If the line is not a comment, the code then checks whether the line starts a new section. If so, it creates a new current section object, to which subsequent settings will be added. 
++console.log(parseINI(` ++name=Vasilis ++[address] ++city=Tessaloniki`)); ++// → {name: "Vasilis", address: {city: "Tessaloniki"}} ++``` + +-The last meaningful possibility is that the line is a normal setting, which the code adds to the current section object. ++The code goes over the file's lines and builds up an object. Properties at the top are stored directly into that object, whereas properties found in sections are stored in a separate section object. The `section` binding points at the object for the current section. + +-If a line matches none of these forms, the function throws an error. ++There are two kinds of significant lines—section headers or property lines. When a line is a regular property, it is stored in the current section. When it is a section header, a new section object is created, and `section` is set to point at it. + + Note the recurring use of `^` and `
/.test("<🌹>")); ++// → false ++console.log(/<.>/u.test("<🌹>")); ++// → true ++``` ++ ++The problem is that the 🍎 in the first line is treated as two code units, and the `{3}` part is applied only to the second one. Similarly, the dot matches a single code unit, not the two that make up the rose emoji. ++ ++You must add a `u` option (for Unicode) to your regular expression to make it treat such characters properly. The wrong behavior remains the default, unfortunately, because changing that might cause problems for existing code that depends on it. ++ ++Though this was only just standardized and is, at the time of writing, not widely supported yet, it is possible to use `\p` in a regular expression (that must have the Unicode option enabled) to match all characters to which the Unicode standard assigns a given property. ++ ++``` ++console.log(/\p{Script=Greek}/u.test("α")); ++// → true ++console.log(/\p{Script=Arabic}/u.test("α")); ++// → false ++console.log(/\p{Alphabetic}/u.test("α")); ++// → true ++console.log(/\p{Alphabetic}/u.test("!")); ++// → false ++``` ++ ++Unicode defines a number of useful properties, though finding the one that you need may not always be trivial. You can use the `\p{Property=Value}` notation to match any character that has the given value for that property. If the property name is left off, as in `\p{Name}`, the name is assumed to either be a binary property such as `Alphabetic` or a category such as `Number`. + + ## Summary + +-Regular expressions are objects that represent patterns in strings. They use their own syntax to express these patterns. ++Regular expressions are objects that represent patterns in strings. They use their own language to express these patterns. + + | `/abc/` | A sequence of characters | + | `/[abc]/` | Any character from a set of characters | +@@ -613,7 +649,7 @@ Regular expressions are objects that represent patterns in strings. 
They use the + | `/x+?/` | One or more occurrences, nongreedy | + | `/x*/` | Zero or more occurrences | + | `/x?/` | Zero or one occurrence | +-| `/x{2,4}/` | Between two and four occurrences | ++| `/x{2,4}/` | Two to four occurrences | + | `/(abc)/` | A group | + | `/a|b|c/` | Any one of several patterns | + | `/\d/` | Any digit character | +@@ -624,15 +660,13 @@ Regular expressions are objects that represent patterns in strings. They use the + | `/^/` | Start of input | + | `/$/` | End of input | + +-A regular expression has a method `test` to test whether a given string matches it. It also has an `exec` method that, when a match is found, returns an array containing all matched groups. Such an array has an `index` property that indicates where the match started. +- +-Strings have a `match` method to match them against a regular expression and a `search` method to search for one, returning only the starting position of the match. Their `replace` method can replace matches of a pattern with a replacement string. Alternatively, you can pass a function to `replace`, which will be used to build up a replacement string based on the match text and matched groups. ++A regular expression has a method `test` to test whether a given string matches it. It also has a method `exec` that, when a match is found, returns an array containing all matched groups. Such an array has an `index` property that indicates where the match started. + +-Regular expressions can have options, which are written after the closing slash. The `i` option makes the match case insensitive, while the `g` option makes the expression _global_, which, among other things, causes the `replace` method to replace all instances instead of just the first. ++Strings have a `match` method to match them against a regular expression and a `search` method to search for one, returning only the starting position of the match. 
Their `replace` method can replace matches of a pattern with a replacement string or function. + +-The `RegExp` constructor can be used to create a regular expression value from a string. ++Regular expressions can have options, which are written after the closing slash. The `i` option makes the match case-insensitive. The `g` option makes the expression _global_, which, among other things, causes the `replace` method to replace all instances instead of just the first. The `y` option makes it sticky, which means that it will not search ahead and skip part of the string when looking for a match. The `u` option turns on Unicode mode, which fixes a number of problems around the handling of characters that take up two code units. + +-Regular expressions are a sharp tool with an awkward handle. They simplify some tasks tremendously but can quickly become unmanageable when applied to complex problems. Part of knowing how to use them is resisting the urge to try to shoehorn things that they cannot sanely express into them. ++Regular expressions are a sharp tool with an awkward handle. They simplify some tasks tremendously but can quickly become unmanageable when applied to complex problems. Part of knowing how to use them is resisting the urge to try to shoehorn things that they cannot cleanly express into them. + + ## Exercises + +@@ -652,11 +686,11 @@ For each of the following items, write a regular expression to test whether any + + 4. Any word ending in _ious_ + +-5. A whitespace character followed by a dot, comma, colon, or semicolon ++5. A whitespace character followed by a period, comma, colon, or semicolon + + 6. A word longer than six letters + +-7. A word without the letter _e_ ++7. A word without the letter _e_ (or _E_) + + Refer to the table in the [chapter summary](09_regexp.html#summary_regexp) for help. Test each solution with a few test strings. 
+ +@@ -669,7 +703,7 @@ verify(/.../, + + verify(/.../, + ["pop culture", "mad props"], +- ["plop"]); ++ ["plop", "prrrop"]); + + verify(/.../, + ["ferret", "ferry", "ferrari"], +@@ -681,7 +715,7 @@ verify(/.../, + + verify(/.../, + ["bad punctuation ."], +- ["escape the dot"]); ++ ["escape the period"]); + + verify(/.../, + ["hottentottententen"], +@@ -689,19 +723,17 @@ verify(/.../, + + verify(/.../, + ["red platypus", "wobbling nest"], +- ["earth bed", "learning ape"]); ++ ["earth bed", "learning ape", "BEET"]); + + function verify(regexp, yes, no) { + // Ignore unfinished exercises + if (regexp.source == "...") return; +- yes.forEach(function(s) { +- if (!regexp.test(s)) +- console.log("Failure to match '" + s + "'"); +- }); +- no.forEach(function(s) { +- if (regexp.test(s)) +- console.log("Unexpected match for '" + s + "'"); +- }); ++ for (let str of yes) if (!regexp.test(str)) { ++ console.log(`Failure to match '${str}'`); ++ } ++ for (let str of no) if (regexp.test(str)) { ++ console.log(`Unexpected match for '${str}'`); ++ } + } + ``` + +@@ -712,7 +744,7 @@ Imagine you have written a story and used single quotation marks throughout to m + Think of a pattern that distinguishes these two kinds of quote usage and craft a call to the `replace` method that does the proper replacement. + + ``` +-var text = "'I'm the cook,' he said, 'it's my job.'"; ++let text = "'I'm the cook,' he said, 'it's my job.'"; + // Change this call. + console.log(text.replace(/A/g, "B")); + // → "I'm the cook," he said, "it's my job." +@@ -724,28 +756,28 @@ In addition, you must ensure that the replacement also includes the characters t + + ### Numbers again + +-A series of digits can be matched by the simple regular expression `/\d+/`. +- + Write an expression that matches only JavaScript-style numbers. It must support an optional minus _or_ plus sign in front of the number, the decimal dot, and exponent notation—`5e-3` or `1E10`— again with an optional sign in front of the exponent. 
Also note that it is not necessary for there to be digits in front of or after the dot, but the number cannot be a dot alone. That is, `.5` and `5.` are valid JavaScript numbers, but a lone dot _isn't_. + + ``` + // Fill in this regular expression. +-var number = /^...$/; ++let number = /^...$/; + + // Tests: +-["1", "-1", "+15", "1.55", ".5", "5.", "1.3e2", "1E-4", +- "1e+12"].forEach(function(s) { +- if (!number.test(s)) +- console.log("Failed to match '" + s + "'"); +-}); +-["1a", "+-1", "1.2.3", "1+1", "1e4.5", ".5.", "1f5", +- "."].forEach(function(s) { +- if (number.test(s)) +- console.log("Incorrectly accepted '" + s + "'"); +-}); +-``` +- +-First, do not forget the backslash in front of the dot. ++for (let str of ["1", "-1", "+15", "1.55", ".5", "5.", ++ "1.3e2", "1E-4", "1e+12"]) { ++ if (!number.test(str)) { ++ console.log(`Failed to match '${str}'`); ++ } ++} ++for (let str of ["1a", "+-1", "1.2.3", "1+1", "1e4.5", ++ ".5.", "1f5", "."]) { ++ if (number.test(str)) { ++ console.log(`Incorrectly accepted '${str}'`); ++ } ++} ++``` ++ ++First, do not forget the backslash in front of the period. + + Matching the optional sign in front of the number, as well as in front of the exponent, can be done with `[+\-]?` or `(\+|-|)` (plus, minus, or nothing). +
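As a minimal sketch of just this sign-matching step (not a solution to the whole exercise), the two forms can be tried against a few inputs. The binding names `signedA` and `signedB`, and the use of `\d+` as a stand-in for the rest of the number, are assumptions made for illustration only.

```javascript
// Illustrative only: `\d+` is a placeholder for the full number pattern.
let signedA = /^[+\-]?\d+$/;   // a character set, made optional with ?
let signedB = /^(\+|-|)\d+$/;  // alternation: plus, minus, or nothing

for (let str of ["15", "+15", "-15", "+-15", "15-"]) {
  // Print each input along with whether each pattern accepts it
  console.log(str, signedA.test(str), signedB.test(str));
}
```

Both patterns accept an unsigned number or one with a single leading sign, and both reject a doubled sign or a trailing one, so the choice between them is mostly a matter of taste.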