  • This is an introduction to modeling in event history analysis.

  • The first part deals with the famous Cox model. The brilliant idea of David R. Cox in 1972

  • was to combine two types of analysis: regression and life tables. The Cox model

  • can be seen as the control of the effect of the explanatory variables in the

  • survival analysis through regression, or as the introduction of the temporal

  • dimension in the regression. The strengths of one technique make it

  • possible to fill the gaps of the other. In the case of the logit model, the odds of

  • belonging to a category are computed at a given point in the life of the

  • individual, regardless of when the status changed. The duration, that is, the elapsed time, is

  • therefore an important dimension that is missing in the logit model.

  • In particular, the censoring by the date of the survey or emigration is not taken

  • into account. A good part of the sample whose observations are censored is not

  • taken into account in the analysis if we do not explicitly consider time.

  • On the other hand, if we simply describe the event with the life

  • table technique, it would be difficult to control for the influence of explanatory variables.

  • Splitting the sample into different categories according to

  • generations, or rural origin, etc., leads to small sub-samples with insufficient

  • numbers for analysis, especially to measure the combined influence of

  • several explanatory factors. To solve both the problem of duration and

  • that of explanatory factors, David Cox's idea was to combine survival analysis with

  • regression analysis. First, Cox proposed a regression not on the characteristics

  • acquired by the individual at the end of his life or at the time of the observation

  • but on the characteristics acquired in each year of life. In a way, each year lived

  • by each member of the sample constitutes an observation.

  • The reference category of the regression is not unique for the whole sample

  • but it is specific to each observation period. This series of probabilities makes it possible to

  • establish a reference survival curve, also called a baseline survival function.

  • This is the nonparametric part of the model. Then the Cox regression model

  • calculates the effect of the explanatory variables on the annual risk of

  • experiencing the event. Each variable is associated with a regression coefficient

  • that measures the average effect of this variable on the annual risk.

  • This is the parametric part of the model. In this model, h0(t) is the hazard function

  • for the reference category, and the Bi are a series of coefficients associated with

  • indicator variables Xij. The model therefore has a nonparametric component,

  • the baseline hazard function formed from the series of hazards h0(t),

  • and a parametric component, the vector of independent variables.

  • Because of these two components, the model is also called a semi-parametric model.

  • In fact, for computational reasons, it is the logarithms of the hazards

  • and not the hazards themselves that are modeled in an additive model.

  • The model is part of the family of log-linear models. But at the moment of analysis, it is usually

  • the exponentials of the coefficients that are interpreted, as multiplicative effects.

  • The coefficients of the regression do not have an easy

  • and immediate interpretation.

  • From the causal relation point of view,

  • the only explanatory element in this minimal model is the entry of the

  • individual into the population subjected to the risk with such or such characteristics.

  • The diagram reads as follows: entry into observation O

  • at time (t - 1) with characteristics X is a possible cause of the occurrence of

  • event E in the interval (t - 1, t). This representation follows the

  • principle of the anteriority of the cause X on the effect E.

  • The probability of occurrence of the event varies depending on whether the individual has

  • characteristics X or not. It is assumed that the observation time interval is

  • small enough that the risk is constant during the interval. Here again, the

  • smaller the interval, the weaker this assumption. The calculation is repeated

  • as many times as there are time intervals until the end of observation OBE.

  • Although X is not an event, we can consider it as such on the interval (t-1,t).

  • Indeed, if X is defined at the beginning of each time interval, and if

  • the calculated risk is assumed to be constant over the interval, we approach

  • the causal relationship where O, the observation entry at the beginning of

  • the interval is taken as an explanatory event, since one must be present at time (t-1)

  • to experience the risk in the interval (t - 1, t). We are very close

  • to the basic causal relationship but not quite. The effect of X is not calculated

  • separately over each time interval but averaged over all time intervals.

  • Each variable X is therefore not associated with a particular unit of time,

  • which distinguishes it from a cause precisely located in time and event.

  • One says that the effect of the variable is proportional to the annual probability

  • of experiencing the event. This is why the Cox model

  • is called a proportional hazards model. Let's take a very simple example

  • with a single explanatory variable, for example sex.

  • The variable X is called X1 and

  • the corresponding coefficient B1. This model is as follows:

  • Let's look at the two possible cases: either the individual is exposed or he is not.

  • For example, either he is a man or he is not. If the individual is exposed, then X1 is equal to 1

  • and the model is written h0(t) * exp(B1). If the individual is not exposed,

  • then X1 is 0 and the expression is reduced to h0(t).

  • We can see that the exponential of B1 does not depend on "t" and therefore applies

  • multiplicatively to all the values of h0(t). It is therefore assumed that the

  • explanatory variables act on the entire hazard function in the same way, whatever "t".

  • This assumption of proportionality is quite strong and it is necessary to test it

  • for each variable of the model. If it does not hold, the model becomes

  • inconsistent and it is then necessary to consider stratifying the sample

  • according to the offending variable. Graphical and statistical methods make

  • it possible to test this assumption, as we'll see in the following screencast.

  • Thank you for your attention... and good luck with your work!
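
The single-variable example in the transcript refers to a formula that is not reproduced in the subtitles. A plausible reconstruction from the spoken description, keeping the transcript's symbols h0(t), B1 and X1, is sketched below; the exact layout of the original slide is an assumption.

```latex
% Model with one indicator variable X_1 (reconstructed from the spoken description)
h(t \mid X_1) = h_0(t)\,\exp(B_1 X_1)

% The two cases discussed in the transcript
X_1 = 1:\quad h(t \mid X_1 = 1) = h_0(t)\,\exp(B_1)
X_1 = 0:\quad h(t \mid X_1 = 0) = h_0(t)

% Their ratio does not depend on t: this is the proportionality assumption
\frac{h(t \mid X_1 = 1)}{h(t \mid X_1 = 0)} = \exp(B_1)

% On the log scale the model is additive
\log h(t \mid X_1) = \log h_0(t) + B_1 X_1
```

Read this way, exp(B1) is the multiplicative effect that is usually interpreted, while B1 itself is the additive effect on the log hazard.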

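The idea that each year lived by each member of the sample constitutes an observation can be illustrated with a small sketch. The individuals, durations and the helper name `person_years` below are hypothetical and only serve to show how censored cases still contribute the years during which they were observed; this is not the estimation procedure of the Cox model itself.

```python
# Hypothetical individuals: (id, years_observed, event_occurred, x1).
# event_occurred=False means the observation is censored
# (for example by the survey date or by emigration).
people = [
    ("A", 3, True, 1),   # experiences the event in year 3
    ("B", 2, False, 0),  # censored after 2 years, no event observed
    ("C", 1, True, 0),   # experiences the event in year 1
]


def person_years(people):
    """Expand each individual into one row per year under observation."""
    rows = []
    for pid, duration, event, x1 in people:
        for year in range(1, duration + 1):
            rows.append({
                "id": pid,
                "year": year,
                "x1": x1,
                # The event is recorded only in the last observed year,
                # and only if the individual was not censored.
                "event": int(event and year == duration),
            })
    return rows


rows = person_years(people)
for row in rows:
    print(row)
```

In this layout the censored individual B still contributes two observation years with no event, instead of being dropped from the analysis.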
Event History Analysis: the Cox model

Posted by qianh7 on 14 January 2021