The following questions are intended for further practice.
A very inexperienced archer shoots an arrow \(n\) times at a disc of (unknown) radius \(\theta\). The disc is hit every time, but at completely random places. Let \(r_1, \ldots, r_n\) be the distances of the various hits to the center of the disc. Determine the maximum likelihood estimator for \(\theta\).
Suppose that \(x_1, \ldots, x_n\) is a dataset, which is a realisation of a random sample from a Rayleigh distribution, which is the continuous distribution with pdf \(f_\theta(x) = \frac{x}{\theta^2}\exp{\lbrace -\frac{1}{2} \frac{x^2}{\theta^2} \rbrace}\) for \(x\geq 0\). Determine the maximum likelihood estimator for \(\theta\).
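Answers to this and the following MLE exercises can be sanity-checked numerically: maximise the log-likelihood over a grid of parameter values and compare with your closed-form estimator. Below is a minimal sketch for the Rayleigh case, using only the Python standard library; the dataset `xs` is a made-up stand-in for an observed sample.

```python
import math

def rayleigh_loglik(theta, xs):
    # log f_theta(x) = log x - 2 log theta - x^2 / (2 theta^2)
    return sum(math.log(x) - 2 * math.log(theta) - x * x / (2 * theta * theta)
               for x in xs)

# a small made-up dataset standing in for an observed realisation
xs = [0.5, 1.2, 0.8, 2.0, 1.5]

# crude grid search for the maximiser of the log-likelihood
thetas = [0.01 * k for k in range(1, 500)]
theta_hat = max(thetas, key=lambda t: rayleigh_loglik(t, xs))
print(round(theta_hat, 2))
```

If your analytic MLE and the grid maximiser disagree by more than the grid spacing, one of the two is wrong. The same pattern works for the other single-parameter densities in this section by swapping in the appropriate log-pdf.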
Suppose that \(x_1, \ldots, x_n\) is a dataset, which is a realisation of a random sample from a distribution with pdf
\[f_\theta(x) = \frac{\theta}{(x + 1)^{\theta + 1}} \text{ for } x > 0\]Determine the MLE for \(\theta\).
Suppose that \(x_1, \ldots, x_n\) is a dataset, which is a realisation of a random sample from a distribution with pdf
\[f_\theta(x) = \begin{cases} e^{\theta - x} & \text{for }\ x > \theta \\ 0 & \text{for }\ x \leq \theta \end{cases}\]Determine the maximum likelihood estimator for \(\theta\).
(+) Suppose that \(x_1, \ldots, x_n\) is a dataset, which is a realisation of a random sample from a distribution with pdf
\[f_\theta(x) = \frac{1}{2} e^{-\lvert x - \theta \rvert} \text{ for }-\infty < x < \infty\]Determine the maximum likelihood estimator for \(\theta\).
Suppose that \(x_1, \ldots, x_n\) is a dataset, which is a realisation of a random sample from a distribution with pdf
\[f_{\mu, \lambda}(x) = \left( \frac{\lambda}{2\pi x^3} \right)^{1/2} \exp{\lbrace - \lambda (x - \mu)^2 / (2\mu^2 x)\rbrace} \text{ for } x>0\]Determine the maximum likelihood estimator for \(\mu\) and \(\lambda\).
Suppose that \(x_1, \ldots, x_n\) is a dataset, which is a realisation of a random sample from a binomial distribution with parameters \((k, p)\), where \(p\) is known and \(k\) is unknown. (For example, this could correspond to observing the number of heads from flips of a coin whose heads probability is known, without knowing how many times the coin was flipped.)
Explain why, if the MLE is \(k\), then
\[\frac{L(k \mid \mathbf{x}, p)}{L(k-1 \mid \mathbf{x}, p)} \geq 1 \text{ and } \frac{L(k + 1 \mid \mathbf{x}, p)}{L(k \mid \mathbf{x}, p)} < 1\]
(+) The unigram model in natural language processing models the probability of a sentence \(s\) as \(\mathbb{P}(s) = p_{s_1} \cdot p_{s_2} \cdot \ldots \cdot p_{s_n}\), where \(s_1, \ldots, s_n\) are the \(n\) words of the sentence. Given \(M\) sentences \(s^1, \ldots, s^M\), show that the MLE for the parameter \(p_{w}\) is \(\frac{c_w}{W}\), where \(c_w\) is the number of times \(w\) occurs in all sentences and \(W\) is the total number of words in all sentences.
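The quantities \(c_w\) and \(W\) in the unigram exercise are straightforward to compute; a minimal sketch on a made-up toy corpus (the sentences here are illustrative, not part of the exercise):

```python
from collections import Counter

# hypothetical toy corpus of M = 3 tokenised sentences
sentences = [["the", "cat", "sat"],
             ["the", "dog", "sat"],
             ["the", "cat", "ran"]]

counts = Counter(w for s in sentences for w in s)   # c_w for each word w
W = sum(counts.values())                            # total word count W
p = {w: c / W for w, c in counts.items()}           # candidate MLE p_w = c_w / W

print(p["the"])   # "the" occurs 3 times out of 9 words
```

Note that the candidate probabilities sum to 1 by construction, which is exactly the constraint the maximisation in the exercise must respect.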
Are the following vectors independent?
Show that if a vector \(v\) belongs to the span of the vectors \(v_1, v_2, v_3\), then \(\lbrace v, v_1, v_2, v_3 \rbrace\) is not linearly independent.
Given \(n + 1\) vectors in \(\mathbb{R}^n\), can they be independent? (Do not give a proof for this).
What is the minimum number \(k\) of vectors in \(\mathbb{R}^n\) that can be linearly dependent?
Give \(n\) vectors in \(\mathbb{R}^n\) which are linearly independent.
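A candidate answer to the independence exercises can be checked numerically: a set of vectors is linearly independent exactly when the matrix with those vectors as rows has full rank. A minimal sketch, assuming NumPy is available (one standard candidate set, the rows of the identity matrix, is used here purely for illustration):

```python
import numpy as np

n = 4
# candidate set of n vectors in R^n, stacked as rows of a matrix
E = np.eye(n)
print(np.linalg.matrix_rank(E))   # rank n <=> the n rows are independent

# any n + 1 vectors in R^n stacked as rows give a matrix of rank at most n
V = np.vstack([E, np.ones(n)])
print(np.linalg.matrix_rank(V))
```

Replace `E` with your own candidate vectors to test an answer; the rank bound in the second check is the numerical counterpart of the claim in the \(n + 1\) vectors exercise.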
Explain why the softmax transformation solves Exercise 1.6. Find a different transformation that also solves this problem.
(Optional +) Find a linear-time algorithm that, given \(n\) points in 2D, finds the shortest distance between any pair of them, where the distance between \((x_1, y_1)\) and \((x_2, y_2)\) is given by \(\lvert x_1 - x_2\rvert + \lvert y_1-y_2\rvert\).
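A brute-force \(O(n^2)\) reference is useful for testing any faster closest-pair algorithm you come up with; a minimal sketch (the point set is made up for illustration):

```python
from itertools import combinations

def manhattan(p, q):
    # |x1 - x2| + |y1 - y2|, the distance used in the exercise
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def closest_pair_brute(points):
    # O(n^2) reference answer: check every pair
    return min(manhattan(p, q) for p, q in combinations(points, 2))

pts = [(0, 0), (3, 4), (1, 1), (6, 2)]
print(closest_pair_brute(pts))   # the closest pair here is (0, 0) and (1, 1)
```

Running your linear-time candidate and this reference on random point sets and comparing the two results is a quick way to catch mistakes.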