Lecture 18: Optimization Problems and Algorithms

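Coefficient of determination: R**2 = 1 - EE/MV, where EE is the error in the estimates (the sum of squared differences between predicted and measured values) and MV is the variability of the measured data (the sum of squared differences between the measurements and their mean).

R**2 = 1 means the model explains all of the variability in the data, i.e. it accounts for the variation perfectly.

R**2 = 0 means there is no linear relationship between the model's predictions and the actual data; the model is useless.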
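A minimal sketch of the formula above, reading EE as the summed squared error of the estimates and MV as the summed squared deviation of the measurements from their mean (an interpretation; the notes do not spell out either term). The function name `r_squared` and the sample data are made up for illustration.

```python
def r_squared(measured, estimated):
    """Return 1 - EE/MV for two equal-length sequences of numbers."""
    mean = sum(measured) / len(measured)
    # EE: squared error between the model's estimates and the measurements.
    ee = sum((e - m) ** 2 for m, e in zip(measured, estimated))
    # MV: squared deviation of the measurements from their own mean.
    mv = sum((m - mean) ** 2 for m in measured)
    return 1 - ee / mv


measured = [1.0, 2.0, 3.0, 4.0]
# Estimates that match the measurements exactly explain all of the variability.
perfect = r_squared(measured, measured)
# Always predicting the mean explains none of it.
mean_only = r_squared(measured, [2.5, 2.5, 2.5, 2.5])
print(perfect, mean_only)
```

The two calls reproduce the two boundary cases described above: a perfect fit and a model that is no better than guessing the mean.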

1) Start with an experiment.

2) Use computation to both find and evaluate a model.

3) Use some theory, analysis, and computation to derive a consequence of the model.

Optimization problems

1) an objective function to be maximized or minimized

2) a set of constraints (possibly empty) that must be satisfied

Problem reduction: map a new problem onto an old one that has already been solved, so that a classic solution can be reused.


Greedy algorithm: at each step, choose the locally optimal solution.
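As an illustration, here is a hedged sketch of a greedy approach to a knapsack-style problem: the objective function is total value, the constraint is a weight limit, and the "locally optimal" choice is assumed to be the item with the best value per unit weight. The item list, the limit of 20, and the name `greedy_knapsack` are all hypothetical.

```python
def greedy_knapsack(items, max_weight):
    """items: list of (name, value, weight). Return (chosen names, total value)."""
    # Locally optimal choice: consider items in order of value per unit weight.
    ranked = sorted(items, key=lambda item: item[1] / item[2], reverse=True)
    chosen, total_value, remaining = [], 0, max_weight
    for name, value, weight in ranked:
        if weight <= remaining:  # constraint: never exceed the weight limit
            chosen.append(name)
            total_value += value
            remaining -= weight
    return chosen, total_value


# Hypothetical data: (name, value, weight)
items = [
    ("clock", 175, 10),
    ("painting", 90, 9),
    ("radio", 20, 4),
    ("vase", 50, 2),
    ("book", 10, 1),
    ("computer", 200, 20),
]
chosen, total_value = greedy_knapsack(items, max_weight=20)
print(chosen, total_value)
```

With this data the greedy choice reaches a total value of 255, while clock + painting + book also fits in weight 20 and is worth 275. A sequence of locally optimal choices is therefore not guaranteed to be globally optimal.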


Lecture 16: Using Randomness to Solve Non-random Problems

Computational models: how to build them so that they help us understand the real world.

Uniform distribution: every outcome is equally likely.

Exponential distribution: memoryless.

Analytic models vs. simulation models

exponential decay
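A small sketch contrasting the two kinds of model on exponential decay. It assumes a discrete-time setup in which every surviving particle decays with the same probability at each step, regardless of its age (the memoryless property). The parameter values and function names are illustrative only.

```python
import random


def simulate_decay(n_particles, p_decay, n_steps, seed=0):
    """Simulation model: flip a biased coin for every surviving particle."""
    rng = random.Random(seed)
    remaining = n_particles
    history = [remaining]
    for _step in range(n_steps):
        survivors = 0
        for _particle in range(remaining):
            if rng.random() >= p_decay:  # this particle survives the step
                survivors += 1
        remaining = survivors
        history.append(remaining)
    return history


def analytic_decay(n_particles, p_decay, n_steps):
    """Analytic model: expected number remaining is N * (1 - p) ** t."""
    return [n_particles * (1 - p_decay) ** t for t in range(n_steps + 1)]


simulated = simulate_decay(1000, 0.1, 10)
analytic = analytic_decay(1000, 0.1, 10)
for t, (s, a) in enumerate(zip(simulated, analytic)):
    print(t, s, round(a, 1))
```

The simulated counts should scatter around the analytic curve, with the gap shrinking in relative terms as the number of particles grows.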


Utility: which questions are answerable with the model (its validity).

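Monty Hall problem (choose one of three doors): Monty's choice of which door to open is not independent of the player's choice. Switching wins with probability 2/3; staying wins with probability 1/3.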
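The 2/3 versus 1/3 result can be checked with a short simulation. This is a sketch under the usual assumptions: the prize is placed uniformly at random, and Monty always opens a door that is neither the player's pick nor the prize. The function name and trial count are arbitrary.

```python
import random


def monty_hall(n_trials, switch, seed=0):
    """Return the fraction of games won when the player switches (or not)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        prize = rng.randrange(3)
        pick = rng.randrange(3)
        # Monty's choice depends on the player's pick: he never opens
        # the picked door and never reveals the prize.
        opened = rng.choice([d for d in range(3) if d != pick and d != prize])
        if switch:
            # Move to the one door that is neither picked nor opened.
            pick = next(d for d in range(3) if d != pick and d != opened)
        if pick == prize:
            wins += 1
    return wins / n_trials


switch_rate = monty_hall(20000, switch=True)
stay_rate = monty_hall(20000, switch=False)
print(switch_rate, stay_rate)
```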

Using random methods to solve non-random problems.
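One classic instance of this idea, chosen here as an illustration rather than taken from these notes, is estimating pi: the value is not random at all, but random points thrown into the unit square land inside the quarter circle with probability pi/4.

```python
import random


def estimate_pi(n_points, seed=0):
    """Estimate pi from the fraction of random points inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Area of quarter circle / area of unit square = pi / 4.
    return 4 * inside / n_points


pi_estimate = estimate_pi(100000)
print(pi_estimate)
```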


Lecture 14: Sampling and Monte Carlo Simulation


Monte Carlo simulation -> inferential statistics (a random sample tends to exhibit the same properties as the population from which it is drawn)

Bernoulli’s law (law of large numbers)

In repeated independent tests with the same actual probability p of an outcome, the fraction of times that outcome occurs converges to p as the number of trials goes to infinity.
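A quick sketch of the law of large numbers using simulated coin flips. The probability p = 0.5 and the trial counts are arbitrary choices for illustration.

```python
import random


def fraction_of_successes(n_trials, p=0.5, seed=0):
    """Run n_trials independent tests, each succeeding with probability p."""
    rng = random.Random(seed)
    successes = sum(1 for _ in range(n_trials) if rng.random() < p)
    return successes / n_trials


# The observed fraction should drift toward p as the number of trials grows.
fractions = {n: fraction_of_successes(n) for n in (10, 100, 1000, 10000, 100000)}
for n, frac in fractions.items():
    print(n, frac)
```

Note that the law says nothing about small samples: the fraction after 10 trials can be far from p.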

Lecture 15: Statistical Thinking


How many samples are needed to have confidence in results?
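One way to approach this question, sketched here under the assumption that the spread of repeated estimates is a reasonable proxy for confidence, is to repeat the whole experiment several times and look at the standard deviation of the results. The coin-flip setup and the sample sizes are illustrative.

```python
import random
import statistics


def spread_of_estimates(n_flips, n_trials, seed=0):
    """Repeat an n_flips experiment n_trials times; return (mean, stdev)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_trials):
        heads = sum(1 for _ in range(n_flips) if rng.random() < 0.5)
        estimates.append(heads / n_flips)
    return statistics.mean(estimates), statistics.stdev(estimates)


small_mean, small_sd = spread_of_estimates(100, 50)
large_mean, large_sd = spread_of_estimates(10000, 50)
print(small_mean, small_sd)
print(large_mean, large_sd)
```

A smaller standard deviation across trials suggests the sample size is large enough for the estimate to be stable; here the spread should shrink roughly with the square root of the number of flips.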


Lecture 12: Introduction to Simulation and Random Walks

Review: Person -> MIT Person -> UG/G (the class hierarchy, with undergraduate and graduate students at the bottom)

A function that uses yield is a generator.

A generator is a function that remembers the point in its body from which it last returned, plus all of its local variables.
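A minimal example of that behavior. The function `countdown` is made up for illustration; each call to `next` resumes right after the `yield`, with the local variable `n` intact.

```python
def countdown(n):
    """Yield n, n-1, ..., 1, pausing after each value."""
    while n > 0:
        yield n   # execution pauses here and resumes here on the next call
        n -= 1    # the local variable n is remembered between calls


gen = countdown(3)
first = next(gen)    # runs the body up to the first yield
second = next(gen)   # resumes after the yield, decrements n, yields again
rest = list(gen)     # drains whatever is left
print(first, second, rest)
```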

Analytic methods:

-- predict behavior given some initial conditions and some parameters

Simulation methods:

-- useful for systems that are not mathematically tractable

-- a model can be refined successively through a series of simulations

-- it is easier to extract useful intermediate results
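As a sketch of the simulation approach, here is a simple two-dimensional random walk, the topic named in this lecture's title. The walker is assumed to take unit steps north, south, east, or west with equal probability; the function names and trial counts are hypothetical.

```python
import math
import random

STEPS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # north, south, east, west


def walk_distance(n_steps, rng):
    """Take n_steps random unit steps from the origin; return final distance."""
    x = y = 0
    for _ in range(n_steps):
        dx, dy = rng.choice(STEPS)
        x += dx
        y += dy
    return math.hypot(x, y)


def mean_distance(n_steps, n_trials, seed=0):
    """Average the final distance over n_trials independent walks."""
    rng = random.Random(seed)
    total = sum(walk_distance(n_steps, rng) for _ in range(n_trials))
    return total / n_trials


for n in (10, 100, 1000):
    print(n, round(mean_distance(n, 200), 2))
```

Intermediate results, such as the position after each step, are easy to extract by recording `x` and `y` inside the loop.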




Lecture 10: Hashing and Classes


Hashing is how dictionaries are implemented in Python. It trades space for time efficiency.


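hash(i) -> some integer in range(0, k) -> index into a list of lists (each element of the outer list, i.e. each inner list, is called a bucket) -> search that bucket to find whether i is in it

# Indexing to the i-th element of the outer list takes constant time, so if each bucket is short enough, the whole lookup is efficient.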
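The lookup chain above can be sketched as a small set of integers. This is an illustrative toy, not how Python's built-in dict is actually laid out; the class name `IntSet`, the modulo hash, and the default bucket count are assumptions.

```python
class IntSet:
    """A toy hash set of integers built from a list of buckets."""

    def __init__(self, num_buckets=47):
        self.num_buckets = num_buckets
        # Outer list: one (initially empty) bucket per possible hash value.
        self.buckets = [[] for _ in range(num_buckets)]

    def _hash(self, i):
        # Map i to some integer in range(0, num_buckets).
        return i % self.num_buckets

    def insert(self, i):
        bucket = self.buckets[self._hash(i)]  # constant-time indexing
        if i not in bucket:                   # linear scan of one short bucket
            bucket.append(i)

    def member(self, i):
        return i in self.buckets[self._hash(i)]


s = IntSet(num_buckets=5)
for value in (3, 8, 13, 4):
    s.insert(value)
print(s.member(8), s.member(18), s.buckets[3])
```

With only 5 buckets, 3, 8, and 13 collide in the same bucket, so lookups there degrade toward a linear scan. More buckets use more space but keep each bucket short.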