• In this video, we'll introduce terms and notation that we'll use throughout this course.

• Let's start with variable types.

• We'll compare and contrast two pairs of variable types.

• Here's the first.

• The first pair is response variable versus explanatory variable.

• The analyst is primarily interested in the response variable.

• We want to know if, and how, we can understand the response variable better using other variables.

• In the Commute and Chris setup, questions 1 and 2 have commute as the response variable.

• Chris hopes to understand how commute is affected by other variables.

• The response variable goes by other names, like output variable and dependent variable.

• In contrast, an explanatory variable is any variable used to study the response variable.

• The goal here is to find potential relationships between the response variable and an explanatory variable.

• In the Commute and Chris setup, question 1 uses departure as an explanatory variable to analyze commute.

• We also call explanatory variables input variables or independent variables.

• Sometimes, we may even call them predictors or features.

• Even though we can call response and explanatory variables dependent and independent variables, these terms are conceptually different from the independent random variables that we encountered in probability.

• Okay, that's our first pair.

• Our second variable type pair is a quantitative variable versus a qualitative variable.

• As the name suggests, quantitative variables take on quantities, and we can divide them into two main groups: count variables and continuous variables.

• Count variables take on non-negative integers, while continuous variables take on values from an interval.

• We commonly call qualitative variables categorical variables.

• These variables take on a small number of possible categories, also known as classes or levels.

• It's common to assign numbers to the categories, but this does not convert the variables into count variables.

• Okay, let's review the Commute and Chris setup and categorize the eight variables into count, continuous, and categorical variables.

• Commute is measured in minutes.

• Because any value exceeding zero is possible, commute is a continuous variable.

• For similar reasons, departure, temp, and precip chance are also continuous variables.

• They just take on values from different intervals.

• Next, precip, season, and accident all take on two or four possible outcomes, making them categorical variables.

• Last is police, which takes on non-negative integers, making it a count variable.
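The classification above can be written down as a small lookup table. This is only a sketch: the variable names come from the transcript, while the identifier spellings (such as `precip_chance`) and the helper functions are assumptions made for illustration.

```python
# Variable types as stated in the transcript for the Commute and Chris setup.
# Identifier spellings (e.g. "precip_chance") are illustrative assumptions.
VARIABLE_TYPES = {
    "commute": "continuous",
    "departure": "continuous",
    "temp": "continuous",
    "precip_chance": "continuous",
    "precip": "categorical",
    "season": "categorical",
    "accident": "categorical",
    "police": "count",
}


def group_by_type(variable_types):
    """Invert the mapping: type label -> sorted list of variable names."""
    groups = {}
    for name, kind in variable_types.items():
        groups.setdefault(kind, []).append(name)
    return {kind: sorted(names) for kind, names in groups.items()}


def is_quantitative(kind):
    """Count and continuous variables are both quantitative."""
    return kind in {"count", "continuous"}


groups = group_by_type(VARIABLE_TYPES)
```

Grouping by type makes it easy to check that all eight variables are accounted for and that only the categorical ones are qualitative.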

• We can subdivide categorical variables further into nominal and ordinal variables.

• If there's not a meaningful order to the categories, then it's a nominal variable.

• In the case where we assign numbers to categories, the numbers only act as labels.

• If there is a meaningful order to the categories, then it's an ordinal variable.

• In the case where numbers are assigned to the categories, the numbers communicate the order.

• Let's use season from the Commute and Chris setup as an example.

• If we assign 1 to winter, 2 to spring, 3 to summer, and 4 to fall, then the categories follow the calendar seasons in sequence and therefore have meaningful order.

• This makes season an ordinal variable.

• If instead we assign numbers based on alphabetical order of the seasons, then we do not have meaningful order to the categories.

• And this makes season a nominal variable.
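The two ways of numbering the seasons can be compared directly. The sketch below is illustrative only; the helper `codes_follow_calendar` is a hypothetical name, and "meaningful order" is checked here simply as "sorting by code reproduces the calendar sequence".

```python
SEASONS_IN_CALENDAR_ORDER = ["winter", "spring", "summer", "fall"]

# Coding from the transcript: 1 = winter, 2 = spring, 3 = summer, 4 = fall.
calendar_codes = {
    season: code
    for code, season in enumerate(SEASONS_IN_CALENDAR_ORDER, start=1)
}

# Alternative coding: numbers assigned by alphabetical order of the names.
alphabetical_codes = {
    season: code
    for code, season in enumerate(sorted(SEASONS_IN_CALENDAR_ORDER), start=1)
}


def codes_follow_calendar(codes):
    """True if sorting the seasons by their code gives the calendar sequence."""
    return sorted(codes, key=codes.get) == SEASONS_IN_CALENDAR_ORDER
```

Under the calendar coding the numbers carry the order, so season behaves as an ordinal variable; under the alphabetical coding the numbers are just labels, so it behaves as a nominal variable.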

• Now that we've explored variable types, let's establish basic notation that we'll use throughout this course.

• We denote variables in general by the letter x.

• If there are multiple variables, we use the subscript j to distinguish between variables.

• However, it's common to use the letter y to denote response variables.

• We use p to represent the number of variables in a dataset, excluding the response variable if there is one.

• This means j can take on integer values from 1 to p.

• For example, x sub 2 represents the second explanatory variable.

• Now, what if we want to refer to a specific observation of a variable?

• We use subscript i for this, and the letter n represents the total number of observations in the dataset.

• This means i can take on integer values from 1 to n.

• Using the Commute and Chris scenario as an example, the fifth observation contains values from the fifth recorded day.

• These include y sub 5, the response variable data point recorded on that day, and x sub 5 comma 1 through x sub 5 comma p, data points for the explanatory variables from the same day.
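Written out symbolically, the spoken notation above would look roughly like this. It is a transcription of the narration into LaTeX, not a formula taken from the course slides.

```latex
\[
  i = 1, \dots, n \quad \text{(observations)}, \qquad
  j = 1, \dots, p \quad \text{(explanatory variables)}
\]
\[
  \text{fifth observation:} \quad
  y_5, \; x_{5,1}, \; x_{5,2}, \; \dots, \; x_{5,p}
\]
```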

• But a word of caution about subscripts.

• When x has two numbers in its subscript, the first number is i and the second number is j, as we have shown.

• However, if x has only one number in its subscript, it can be either i or j.

• So, how can we identify which is which?

• Well, it depends on the context.

• Make sure you read carefully.

• In general, if there is only one x variable, that is, p equals 1, then there is no purpose for subscript j.

• So, in this case, the one number in the subscript is usually i.

• However, if there are multiple x variables, then the one number in the subscript is usually j.

• You may recall from a probability course that we use uppercase letters to represent random variables, such as capital X and capital Y.

• We can add subscripts to these letters the same way.

• Introducing subscript i adds clarity, but it can also make equations or expressions messy and difficult to read.

• We combat this problem by moving to vector and matrix notation.

• If we use a matrix to represent a dataset, rows represent the observations, while columns represent the variables.
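To see the row and column convention concretely, here is a minimal sketch that builds a grid of subscript labels rather than real data. The function names and the sizes n = 5 and p = 3 are arbitrary choices for illustration.

```python
def symbolic_data_matrix(n, p):
    """Return an n-by-p grid of labels: row i is observation i, column j is variable j."""
    return [[f"x_{i},{j}" for j in range(1, p + 1)] for i in range(1, n + 1)]


def entry(matrix, i, j):
    """Look up x_{i,j} using the 1-based subscripts from the notation."""
    return matrix[i - 1][j - 1]


X = symbolic_data_matrix(n=5, p=3)

# Row 5 holds the fifth observation: x_5,1 through x_5,p.
fifth_observation = X[4]

# Column 2 holds every observation of the second variable: x_1,2 through x_n,2.
second_variable = [row[1] for row in X]
```

Note that Python lists are 0-based, which is why `entry` subtracts 1 from each subscript before indexing.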

• We'll see more of this in future sections, but for now, let's review some basic facts about matrices.

• First, for matrix A, A superscript T is A's transpose.

• Transposing simply means swapping the rows and columns so that the k-th column becomes the k-th row, and vice versa.

• Notice that transposing a matrix also reverses its dimensions.

• If A is an a-by-b matrix, then A transpose is a b-by-a matrix.

• And second, A superscript negative one is A's inverse.

• If we multiply a matrix by its inverse in either order, we will get the identity matrix.

• Note that the identity matrix has ones on its diagonal and zeros elsewhere.
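Both facts can be checked on small examples. The sketch below uses plain Python lists and exact fractions; the matrices and helper names are arbitrary illustrations, and the inverse helper only handles the 2-by-2 case (an inverse exists only for a square matrix with a nonzero determinant).

```python
from fractions import Fraction


def transpose(m):
    """Swap rows and columns: the k-th column of m becomes the k-th row."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    """Invert a 2-by-2 matrix via the adjugate formula; fails if the determinant is zero."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]


IDENTITY_2 = [[1, 0], [0, 1]]  # ones on the diagonal, zeros elsewhere

A = [[2, 1], [5, 3]]           # arbitrary invertible example
A_inv = inverse_2x2(A)

B = [[1, 2, 3], [4, 5, 6]]     # 2-by-3, so its transpose is 3-by-2
B_T = transpose(B)
```

Multiplying `A` by `A_inv` in either order with `matmul` gives `IDENTITY_2`, and `B_T` has three rows of two entries each, matching the dimension reversal described above.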

• One of the enemies in this course is confusion.

• We'll try to minimize confusion by using clear and consistent notation.

• However, don't assume that the conventions that we use here are universal.

• Remember, notation only represents concepts.

• Authors may use different notation for the same concept to suit their needs.

• They may even use the same notation for different but similar concepts.

• So, train yourself to distinguish the concept from the notation.

# 1.1 Data Terminology and Notation, SRM (Coaching Actuaries)

Posted by 楊成明 on June 25, 2024