<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <title>Break&#x27;s Forge World</title>
    <subtitle>Break&#x27;s Forge World</subtitle>
    <link rel="self" type="application/atom+xml" href="https://www.breakds.org/atom.xml"/>
    <link rel="alternate" type="text/html" href="https://www.breakds.org"/>
    <generator uri="https://www.getzola.org/">Zola</generator>
    <updated>2024-08-25T00:00:00+00:00</updated>
    <id>https://www.breakds.org/atom.xml</id>
    <entry xml:lang="en">
        <title>The Intuitive VAE</title>
        <published>2024-08-25T00:00:00+00:00</published>
        <updated>2024-08-25T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://www.breakds.org/the-intuitive-vae/"/>
        <id>https://www.breakds.org/the-intuitive-vae/</id>
        
        <content type="html" xml:base="https://www.breakds.org/the-intuitive-vae/">&lt;p&gt;Variational Autoencoder, also known as VAE, is an elegant algorithm in machine learning. This post summarizes my attempt to teach the math behind VAE in an intuitive way.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;maximum-likelihood-estimation-mle&quot;&gt;Maximum Likelihood Estimation (MLE)&lt;&#x2F;h1&gt;
&lt;p&gt;A common problem (arguably the central problem) in machine learning is learning the underlying distribution of a dataset $X_{\text{Data}}$. This dataset contains $n$ samples:&lt;&#x2F;p&gt;
&lt;p&gt;$$
X_{\text{Data}} = \{ x_1, x_2, \cdots, x_n \}, \text{ where } x_i \in \mathbb{R}^d
$$&lt;&#x2F;p&gt;
&lt;p&gt;Assuming the underlying distribution has a probability density function (pdf) of $p(x)$, we aim to find (or fit) a parameterized function $p_\theta(x)$, such that by optimizing with respect to $\theta$,&lt;&#x2F;p&gt;
&lt;p&gt;$$
p_\theta(x) \text{ can closely approximate } p(x)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Why do we want to learn $p(x)$? Because:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;We can then sample from it (e.g., &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;openai.com&#x2F;index&#x2F;dall-e-3&#x2F;&quot;&gt;DALL·E 3&lt;&#x2F;a&gt;).&lt;&#x2F;li&gt;
&lt;li&gt;We can evaluate $p_\theta(y)$ to determine how likely $y$ is to be a sample from the underlying distribution.&lt;&#x2F;li&gt;
&lt;li&gt;We can use it for various downstream tasks.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;The maximum likelihood estimation approach suggests maximizing the following objective with respect to $\theta$:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\theta^{*} = \arg\max_{\theta} \prod_{i=1}^n p_\theta(x_i)
$$&lt;&#x2F;p&gt;
&lt;p&gt;This is often simplified by taking the logarithm of the objective, converting the product into a &lt;strong&gt;sum&lt;&#x2F;strong&gt;:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\theta^{*} = \arg\max_{\theta} \sum_{i=1}^n \log p_\theta(x_i)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Now, if we assume the underlying distribution is as simple as a Gaussian, $p_\theta$ can be parameterized with $\theta = (\mu, \sigma)$:&lt;&#x2F;p&gt;
&lt;p&gt;$$
p_\theta(x) = \frac{1}{\sqrt{2 \pi}\sigma} \exp \left( -\frac{(x - \mu)^2}{2 \sigma^2}\right)
$$&lt;&#x2F;p&gt;
&lt;p&gt;The optimization then becomes:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\theta^{*} = \arg\min_{\theta} \left[n \cdot \log \sigma + \frac{1}{2\sigma^2}\sum_{i=1}^n (x_i - \mu)^2 \right]
$$&lt;&#x2F;p&gt;
&lt;p&gt;Finding the minimum is straightforward:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\mu &amp;amp;= \text{np.mean}(X_\text{Data}) \\
\sigma &amp;amp;= \text{np.std}(X_\text{Data})
\end{cases}
$$&lt;&#x2F;p&gt;
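This closed-form answer can be sanity-checked numerically. A minimal sketch, assuming NumPy; the synthetic dataset and its parameters below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=2.0, size=100_000)

# MLE for a single Gaussian: sample mean and population standard deviation.
mu_hat = np.mean(data)
sigma_hat = np.std(data)  # ddof=0 matches the MLE, not the unbiased estimator

def log_likelihood(mu, sigma):
    """Sum of log N(x | mu, sigma^2) over the whole dataset."""
    return np.sum(-np.log(np.sqrt(2 * np.pi) * sigma)
                  - (data - mu) ** 2 / (2 * sigma ** 2))

# The closed-form solution should score at least as well as nearby perturbations.
best = log_likelihood(mu_hat, sigma_hat)
print(mu_hat, sigma_hat, best)
```

Perturbing either parameter away from `np.mean` / `np.std` can only decrease the log-likelihood.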
&lt;h1 id=&quot;bring-in-the-generative-model&quot;&gt;Bring in the Generative Model&lt;&#x2F;h1&gt;
&lt;p&gt;Although modeling using a single Gaussian, as described above, is widely used, it is not suitable when the underlying distribution is very complex (e.g., the set of all meaningful images). A common technique to model such complex distributions is to assume the generation process follows something like $p(z) \cdot p(x|z)$. This means you first generate a &lt;strong&gt;code&lt;&#x2F;strong&gt; $z$ from $p(z)$, which broadly describes the characteristics of the final data point, and then sample the final data point from $p(x|z)$.&lt;&#x2F;p&gt;
&lt;p&gt;This assumption is intuitive, and we can understand it through an analogy of sampling a random human being. First, a gene is sampled, which determines many traits of the human, such as height, skin color, etc. We can then sample many individuals from this gene pool, and although they might all be different, they should share some similarities within the cohort. Here, the gene corresponds to $z$, which encodes the general characteristics, and the &quot;generated&quot; human corresponds to $x$. We often assume $p(z) = \mathcal{N}(0, I)$ because many traits are arguably Gaussian distributed&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#1&quot;&gt;1&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Given this, estimating $p(x)$ becomes all about estimating the underlying $p(x|z)$ since $p(z) = \mathcal{N}(0, I)$ is assumed to be &lt;strong&gt;known&lt;&#x2F;strong&gt;. If we model $p(x|z)$ with a neural network parameterized by $\theta$, we can write the model for $p(x)$ as&lt;&#x2F;p&gt;
&lt;p&gt;$$
p_\theta(x) = \int_z p_\theta(x|z) p(z) \, \mathrm{d}z
$$&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Terminology&lt;&#x2F;strong&gt;: We will use the subscript $\theta$ to indicate that a distribution&#x2F;pdf is induced by the parameterized pdf $p_\theta(x|z)$.&lt;&#x2F;p&gt;
&lt;p&gt;Unfortunately, directly optimizing this model using the MLE objective is intractable due to the integral. For now, let&#x27;s take a closer look at the model.&lt;&#x2F;p&gt;
&lt;p&gt;It is possible to generate the same $x$ from two different codes, $z_1$ and $z_2$, as long as $p_\theta(x|z_1) &amp;gt; 0$ and $p_\theta(x|z_2) &amp;gt; 0$. Each sample $x$ can therefore ask &quot;how likely is it that each code generated me?&quot;, and the answer is another set of induced distributions:&lt;&#x2F;p&gt;
&lt;p&gt;$$
p_\theta(z|x) = \frac{p_\theta(x|z) \cdot p(z)}{p_\theta(x)}
$$&lt;&#x2F;p&gt;
&lt;p&gt;From now on, we will refer to $p_\theta(x|z)$ as the (learnable) &lt;strong&gt;decoder&lt;&#x2F;strong&gt; because it deciphers the code into an actual sample. We will call $p_\theta(z|x)$ the $\theta$-induced &lt;strong&gt;encoder&lt;&#x2F;strong&gt;. The &quot;$\theta$-induced&quot; prefix is important because we want to distinguish it from something else introduced in the next section.&lt;&#x2F;p&gt;
&lt;div class=&quot;footnote-definition&quot; id=&quot;1&quot;&gt;&lt;sup class=&quot;footnote-definition-label&quot;&gt;1&lt;&#x2F;sup&gt;
&lt;p&gt;It is perfectly fine to use another form for $p(z)$, but the Gaussian distribution is one of the easiest to work with.&lt;&#x2F;p&gt;
&lt;&#x2F;div&gt;
&lt;h1 id=&quot;kl-divergence-and-eblo&quot;&gt;KL Divergence and ELBO&lt;&#x2F;h1&gt;
&lt;p&gt;An embarrassing fact about the above generative model is that even if we find a way to successfully learn the &lt;strong&gt;decoder&lt;&#x2F;strong&gt; $p_\theta(x|z)$, it would still be difficult to recover the induced encoder $p_\theta(z|x)$&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#2&quot;&gt;2&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;. A natural idea is to learn a separate &lt;strong&gt;encoder&lt;&#x2F;strong&gt; $q_\phi(z|x)$, &lt;strong&gt;independently parameterized&lt;&#x2F;strong&gt; by $\phi$.&lt;&#x2F;p&gt;
&lt;p&gt;I understand your concern: what if $q_\phi(z|x)$ is not consistent with the $\theta$-induced encoder $p_\theta(z|x)$? Well, we can start by analyzing a term that measures the inconsistency between two distributions, namely the &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Kullback%E2%80%93Leibler_divergence&quot;&gt;KL divergence&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\mathrm{KL}(q_\phi(z|x) \mathrel{\Vert} p_\theta(z|x) ) = \mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log \frac{q_\phi(z|x)}{p_\theta(z|x)} \right]
$$&lt;&#x2F;p&gt;
&lt;p&gt;In hindsight, we want to transform the above expression to make $p_\theta(x)$ appear, so that it relates to the MLE objective. This is easily done by applying Bayes&#x27; theorem to $p_\theta(z|x)$ (twice):&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\mathrm{KL}(q_\phi(z|x) \mathrel{\Vert} p_\theta(z|x) ) &amp;amp;=&amp;amp; \mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log \frac{q_\phi(z|x)}{p_\theta(z|x)} \right] \\
&amp;amp;=&amp;amp; \mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log \frac{q_\phi(z|x) \cdot p_\theta(x)}{p_\theta(x|z) \cdot p(z)} \right] \\
&amp;amp;=&amp;amp; \mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log p_\theta(x) \right] +
\mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log \frac{q_\phi(z|x)}{p(z)} \right] - \mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log p_\theta(x|z) \right] \\
&amp;amp;=&amp;amp; \log p_\theta(x) + \mathrm{KL}\left(q_\phi(z|x) \mathrel{\Vert} p(z) \right) - \mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log p_\theta(x|z) \right]
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;The most confusing part might be why we can take $p_\theta(x)$ out of the expectation. This is because the expectation is with respect to $z$, and $p_\theta(x)$ is independent of $z$. This is why I explicitly write out the variable of the expectation. Rearranging the terms on both sides of the equation, we have:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\log p_\theta(x) = \mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log p_\theta(x|z) \right] - \mathrm{KL}\left(q_\phi(z|x) \mathrel{\Vert} p(z) \right) + \mathrm{KL}(q_\phi(z|x) \mathrel{\Vert} p_\theta(z|x) )
$$&lt;&#x2F;p&gt;
&lt;p&gt;If we denote part of the right-hand side as:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\mathrm{ELBO} = \mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log p_\theta(x|z) \right] - \mathrm{KL}\left(q_\phi(z|x) \mathrel{\Vert} p(z) \right)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Then the equation for $\log p_\theta(x)$ becomes:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\log p_\theta(x) = \mathrm{ELBO} + \mathrm{KL}(q_\phi(z|x) \mathrel{\Vert} p_\theta(z|x) )
$$&lt;&#x2F;p&gt;
&lt;p&gt;Because KL divergence is always &lt;strong&gt;non-negative&lt;&#x2F;strong&gt;, the ELBO is a &lt;strong&gt;lower bound&lt;&#x2F;strong&gt; on $\log p_\theta(x)$, which is precisely what we want to maximize in MLE. This is also how the ELBO gets its name (Evidence Lower Bound).&lt;&#x2F;p&gt;
&lt;div class=&quot;footnote-definition&quot; id=&quot;2&quot;&gt;&lt;sup class=&quot;footnote-definition-label&quot;&gt;2&lt;&#x2F;sup&gt;
&lt;p&gt;The induced encoder refers to the conditional distribution $p_\theta(z|x)$, which is difficult to compute directly.&lt;&#x2F;p&gt;
&lt;&#x2F;div&gt;
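To make the lower-bound relationship concrete, here is a toy model (my own choice, purely for illustration) where everything is tractable: $p(z) = \mathcal{N}(0, 1)$ and $p(x|z) = \mathcal{N}(z, 1)$, so that $p(x) = \mathcal{N}(0, 2)$ and the induced encoder is $p(z|x) = \mathcal{N}(x&#x2F;2, 1&#x2F;2)$. The gap is positive for an arbitrary Gaussian $q$ and vanishes exactly when $q$ equals the induced encoder:

```python
import numpy as np

# Toy model where every quantity is available in closed form:
#   p(z) = N(0, 1),  p(x|z) = N(z, 1)  =>  p(x) = N(0, 2),  p(z|x) = N(x/2, 1/2)

def log_px(x):
    # log density of N(0, 2) at x
    return -0.5 * np.log(4 * np.pi) - x ** 2 / 4

def elbo(x, m, s):
    # E_q[log p(x|z)] with q = N(m, s^2), in closed form:
    recon = -0.5 * np.log(2 * np.pi) - 0.5 * ((x - m) ** 2 + s ** 2)
    # KL( N(m, s^2) || N(0, 1) ), in closed form:
    kl = 0.5 * (m ** 2 + s ** 2 - 1.0 - np.log(s ** 2))
    return recon - kl

x = 1.7
gap_bad = log_px(x) - elbo(x, m=0.0, s=1.0)             # arbitrary q: positive gap
gap_opt = log_px(x) - elbo(x, m=x / 2, s=np.sqrt(0.5))  # q = induced encoder: zero gap
print(gap_bad, gap_opt)
```

The gap printed first is the KL divergence between the chosen $q$ and the induced encoder, which is why it closes exactly when the two coincide.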
&lt;h1 id=&quot;properties-of-elbo&quot;&gt;Properties of ELBO&lt;&#x2F;h1&gt;
&lt;p&gt;Can we just maximize the ELBO as a surrogate to maximize $\log p_\theta(x)$? The answer is (not a straightforward) yes. It’s not straightforward because, typically, maximizing a lower bound does not necessarily mean the objective value is maximized—unless the gap is zero!&lt;&#x2F;p&gt;
&lt;p&gt;Let&#x27;s examine the most interesting property of the ELBO, as highlighted by Hung-yi Lee in his &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=8zomhgKrsmQ&amp;amp;t=14s&quot;&gt;talk about VAE&lt;&#x2F;a&gt;. What happens if we &lt;strong&gt;keep $\theta$ fixed&lt;&#x2F;strong&gt; and maximize the ELBO with respect to $\phi$?&lt;&#x2F;p&gt;
&lt;p&gt;$$
\phi^{*} = \arg\max_{\phi} \mathrm{ELBO}
$$&lt;&#x2F;p&gt;
&lt;p&gt;In this case, since $\log p_\theta(x)$ does not change (because $\theta$ is fixed), a maximized ELBO must imply a &lt;strong&gt;minimized gap&lt;&#x2F;strong&gt;:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\log p_\theta(x) = \mathrm{ELBO} \uparrow + \mathrm{KL}(q_\phi(z|x) \mathrel{\Vert} p_\theta(z|x)) \downarrow
$$&lt;&#x2F;p&gt;
&lt;p&gt;However, note that the gap is a KL divergence, whose &lt;strong&gt;minimum value is 0&lt;&#x2F;strong&gt;! Although it may be difficult for the optimizer to reach a global optimum in practice, this clearly suggests that the effect of maximizing the ELBO with respect to $\phi$ is to drive the KL divergence term toward 0, making the ELBO a &lt;strong&gt;tighter lower bound&lt;&#x2F;strong&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;This also makes sense from another perspective, because the mathematical meaning of a shrinking $\mathrm{KL}(q_\phi(z|x) \mathrel{\Vert} p_\theta(z|x))$ is that the learned encoder $q_\phi(z|x)$ is becoming more consistent with the $\theta$-induced encoder $p_\theta(z|x)$.&lt;&#x2F;p&gt;
&lt;p&gt;We now arrive at a more intuitive understanding of (maximizing) the ELBO. If we maximize the ELBO with respect to both $\theta$ and $\phi$, it will simultaneously attempt to:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Raise the lower bound of the target &lt;strong&gt;higher and higher&lt;&#x2F;strong&gt;.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;strong&gt;Close the gap&lt;&#x2F;strong&gt; between the lower bound and the target.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;h1 id=&quot;computing-elbo-for-training&quot;&gt;Computing ELBO for Training&lt;&#x2F;h1&gt;
&lt;p&gt;Recall that the ELBO is defined as:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\mathrm{ELBO} = \mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log p_\theta(x|z) \right] - \mathrm{KL}\left(q_\phi(z|x) \mathrel{\Vert} p(z) \right)
$$&lt;&#x2F;p&gt;
&lt;p&gt;The &lt;strong&gt;first term&lt;&#x2F;strong&gt; $\mathbb{E}_{z \sim q_\phi(\cdot|x)} \left[ \log p_\theta(x|z) \right]$ essentially means that, for each sample $x$ in the batch, we should first evaluate $q_\phi(\cdot|x)$ and sample a code $z$ from the $x$-conditioned distribution. We can then use $\log p_\theta(x|z)$ for this particular $z$ as a &lt;strong&gt;representative&lt;&#x2F;strong&gt; of the first term. This approach is common when working with expectations in the loss function, since such a single-sample estimate is statistically unbiased.&lt;&#x2F;p&gt;
&lt;p&gt;If your model $p_\theta(x|z)$ is defined as a conditional Gaussian distribution (which is typically the case when using VAE), $p_\theta(x|z)$ is just $\mathcal{N}(\mu_\theta(z), \sigma_\theta(z))$. In this case, maximizing the first term is &lt;strong&gt;almost&lt;&#x2F;strong&gt;&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#3&quot;&gt;3&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt; equivalent to minimizing the Euclidean distance between $\mu_\theta$ and $x$ (i.e., $p_\theta(x|z)$ should reconstruct $x$). This is why this term is often referred to as the &lt;strong&gt;reconstruction loss&lt;&#x2F;strong&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;The &lt;strong&gt;second term&lt;&#x2F;strong&gt; is even simpler. It implies that $q_\phi(z|x)$, even when conditioned on $x$, should not deviate too far from the &lt;strong&gt;prior distribution&lt;&#x2F;strong&gt; $p(z)$. When $q_\phi(z|x)$ is modeled as a conditional Gaussian distribution, and $p(z)$ is also assumed to be Gaussian, the KL divergence can be computed in &lt;strong&gt;closed form&lt;&#x2F;strong&gt;.&lt;&#x2F;p&gt;
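For the Gaussian-on-Gaussian case, the closed form can be sanity-checked against a Monte Carlo estimate. A minimal sketch, assuming NumPy; the particular mean and standard deviation below are arbitrary illustration values:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.7, 0.4   # parameters of q(z|x) for one particular x (made up)

# Closed form: KL( N(mu, sigma^2) || N(0, 1) )
kl_closed = 0.5 * (mu ** 2 + sigma ** 2 - 1.0 - np.log(sigma ** 2))

# Monte Carlo estimate of E_q[ log q(z) - log p(z) ] for comparison.
z = rng.normal(mu, sigma, size=2_000_000)
log_q = -np.log(np.sqrt(2 * np.pi) * sigma) - (z - mu) ** 2 / (2 * sigma ** 2)
log_p = -0.5 * np.log(2 * np.pi) - z ** 2 / 2
kl_mc = np.mean(log_q - log_p)
print(kl_closed, kl_mc)
```

The two numbers agree up to Monte Carlo noise, which is exactly why the closed form is preferred in the training loss: it is exact and cheap.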
&lt;p&gt;In practice, you can combine these two terms with a weighting factor as a hyperparameter, allowing you to emphasize which term&#x27;s effect is more important for your specific application. You can even schedule the weight dynamically to perform curriculum training.&lt;&#x2F;p&gt;
&lt;div class=&quot;footnote-definition&quot; id=&quot;3&quot;&gt;&lt;sup class=&quot;footnote-definition-label&quot;&gt;3&lt;&#x2F;sup&gt;
&lt;p&gt;Technically, you also need to account for $\sigma_\theta(z)$.&lt;&#x2F;p&gt;
&lt;&#x2F;div&gt;
&lt;hr &#x2F;&gt;
&lt;h1 id=&quot;comments&quot;&gt;Comments&lt;&#x2F;h1&gt;
&lt;p&gt;The current Zola theme I am using does not support utterances. I am considering switching to &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;welpo&#x2F;tabi&quot;&gt;this theme&lt;&#x2F;a&gt; to enable utterances comments, but for now, if you have comments or want to discuss, please open a &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;breakds&#x2F;www.breakds.org&#x2F;issues&quot;&gt;GitHub issue&lt;&#x2F;a&gt; manually. Sorry about the inconvenience!&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Transformation of Probabilistic Distributions</title>
        <published>2023-04-14T00:00:00+00:00</published>
        <updated>2023-04-14T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://www.breakds.org/transformation-of-probabilistic-distribution/"/>
        <id>https://www.breakds.org/transformation-of-probabilistic-distribution/</id>
        
        <content type="html" xml:base="https://www.breakds.org/transformation-of-probabilistic-distribution/">&lt;h1 id=&quot;motivation&quot;&gt;Motivation&lt;&#x2F;h1&gt;
&lt;blockquote&gt;
&lt;p&gt;To use &lt;code&gt;rsample&lt;&#x2F;code&gt; or not to use &lt;code&gt;rsample&lt;&#x2F;code&gt;, that is the question.&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;If you have ever faced this question while implementing a deep learning algorithm, for example, a policy gradient algorithm for reinforcement learning, this post is for you.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Disclaimer&lt;&#x2F;strong&gt;: Please note that I am more interested in making the math intuitive rather than strict here.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;the-concepts&quot;&gt;The concepts&lt;&#x2F;h1&gt;
&lt;h2 id=&quot;transformation-of-random-variables&quot;&gt;Transformation of Random Variables&lt;&#x2F;h2&gt;
&lt;p&gt;Our exploration starts with a math problem that you might find in the assignments of an introductory-level statistics course. Suppose we have a 1D random variable $\mathbf{X}$ following the standard Gaussian distribution,&lt;&#x2F;p&gt;
&lt;p&gt;$$
\mathbf{X} \sim \mathcal{N}(0, 1)
$$&lt;&#x2F;p&gt;
&lt;p&gt;the probability density function (pdf) of this distribution is&lt;&#x2F;p&gt;
&lt;p&gt;$$
p(x) = \frac{1}{\sqrt{2\pi}} \exp \left( -\frac{x^2}{2} \right)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Now, assume $\mathbf{Y}$ is another random variable that has a close relationship with $\mathbf{X}$.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\mathbf{Y} = \sigma \mathbf{X} + \mu =: f(\mathbf{X})
$$&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Question&lt;&#x2F;strong&gt;: What is the pdf of $\mathbf{Y}$? What kind of distribution does $\mathbf{Y}$ follow?&lt;&#x2F;p&gt;
&lt;p&gt;To answer this question, we can start from simple principles. Let&#x27;s denote the pdf of $\mathbf{Y}$ as $q(\cdot)$. The integral of both pdf $p(\cdot)$ and $q(\cdot)$ over $\mathbb{R}$ must be equal to 1.&lt;&#x2F;p&gt;
&lt;p&gt;$$
1 = \int_{-\infty}^{+\infty}p(x) \mathrm{d}x = \int_{-\infty}^{+\infty}q(y) \mathrm{d}y
$$&lt;&#x2F;p&gt;
&lt;p&gt;Based on intuition, it is not hard to see that the above constraint relating $p(\cdot)$ and $q(\cdot)$ is too &lt;strong&gt;loose&lt;&#x2F;strong&gt;. The probability of $\mathbf{X}$ taking any value $x$ must be equal to the probability of $\mathbf{Y}$ taking the corresponding value $y = f(x)$. This means that&lt;&#x2F;p&gt;
&lt;p&gt;$$
\forall x \in \mathbb{R}, \mathbb{P} \{ X = x \} = \mathbb{P} \{ Y = f(x) \}
$$&lt;&#x2F;p&gt;
&lt;p&gt;The above equation can be expressed as an equivalence of derivative forms (intuitively, though not rigorously, you can understand these as the forms you usually write to the right of $\int$):&lt;&#x2F;p&gt;
&lt;p&gt;$$
\forall x \in \mathbb{R} \text{ and } y = f(x), p(x) \mathrm{d}x = q(y) \mathrm{d}y
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that since $y = f(x)$, and fortunately $f$ is a linear function and therefore invertible, we can now simplify the above as&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
&amp;amp;&amp;amp;p(x) \mathrm{d}x &amp;amp;&amp;amp; = q(y) \mathrm{d}y \\
\Rightarrow &amp;amp;&amp;amp; p(f^{-1}(y)) \mathrm{d}(f^{-1}(y)) &amp;amp;&amp;amp; = q(y) \mathrm{d}y
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;We are now blocked by $\mathrm{d}(f^{-1}(y))$, which is a derivative form. Without a formal definition, we can simplify it with very intuitive rules. For example, in this case since $f(x) = \sigma x + \mu$,&lt;&#x2F;p&gt;
&lt;p&gt;$$
f^{-1}(y) = \frac{y - \mu}{\sigma}
$$&lt;&#x2F;p&gt;
&lt;p&gt;The form $\mathrm{d}(f^{-1}(y))$ is, in plain words, just the change of $f^{-1}(\cdot)$ within an infinitesimal neighborhood around a specific $y$. By applying a symbolic trick (dividing by $\mathrm{d}y$ and multiplying it back), we can derive:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\mathrm{d}(f^{-1}(y)) &amp;amp;&amp;amp; = &amp;amp;&amp;amp; \frac{\mathrm{d}(f^{-1}(y))}{\mathrm{d}y} \cdot \mathrm{d} y \\
&amp;amp;&amp;amp; = &amp;amp;&amp;amp; \frac{\mathrm{d}((y - \mu) &#x2F; \sigma)}{\mathrm{d}y} \cdot \mathrm{d}y \\
&amp;amp;&amp;amp; = &amp;amp;&amp;amp; \frac{1}{\sigma} \cdot \mathrm{d} y \\
&amp;amp;&amp;amp; = &amp;amp;&amp;amp; \frac{\mathrm{d}y}{\sigma}
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Please note that $\mathrm{d}(\cdot)&#x2F;\mathrm{d}y$ is just what we usually call &lt;strong&gt;the derivative w.r.t. $y$&lt;&#x2F;strong&gt;. More generally, if the transformation function $f$ is invertible and differentiable, the above equation reduces to&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{equation}
\mathrm{d}(f^{-1}(y)) = \frac{\mathrm{d} y}{f&#x27;(f^{-1}(y))}
\end{equation}
$$&lt;&#x2F;p&gt;
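This identity can be sanity-checked numerically with a nonlinear invertible function; the choice of $f(x) = e^x$ below is mine, purely for illustration:

```python
import numpy as np

# Numeric check of  d(f^{-1}(y)) = dy / f'(f^{-1}(y))  for an invertible,
# differentiable f. Here f(x) = exp(x), so f^{-1} = log and f' = exp.
f_inv = np.log
f_prime = np.exp

y = 2.3
h = 1e-6
lhs = (f_inv(y + h) - f_inv(y)) / h   # numerical derivative of f^{-1} at y
rhs = 1.0 / f_prime(f_inv(y))         # the right-hand side of the formula
print(lhs, rhs)
```

Both sides evaluate to $1 &#x2F; y$, as expected for the logarithm.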
&lt;p&gt;Now, we have&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
&amp;amp;&amp;amp;p(x) \mathrm{d}x &amp;amp;&amp;amp; = q(y) \mathrm{d}y &amp;amp;&amp;amp; \\
\Rightarrow &amp;amp;&amp;amp; p(f^{-1}(y)) \mathrm{d}(f^{-1}(y)) &amp;amp;&amp;amp; = q(y) \mathrm{d}y &amp;amp;&amp;amp; \\
\Rightarrow &amp;amp;&amp;amp; p(f^{-1}(y)) \frac{\mathrm{d} y}{f&#x27;(f^{-1}(y))} &amp;amp;&amp;amp; = q(y) \mathrm{d}y &amp;amp;&amp;amp; \\
\Rightarrow &amp;amp;&amp;amp; p(\frac{y - \mu}{\sigma}) \cdot \frac{\mathrm{d}y}{\sigma} &amp;amp;&amp;amp; = q(y) \mathrm{d}y &amp;amp;&amp;amp; \\
\Rightarrow &amp;amp;&amp;amp; \frac{1}{\sigma} p(\frac{y - \mu}{\sigma}) &amp;amp;&amp;amp; = q(y) &amp;amp;&amp;amp; (\text{ eliminate } \mathrm{d}y)\\
\Rightarrow &amp;amp;&amp;amp; q(y) = \frac{1}{\sqrt{2 \pi} \sigma} \exp \left( -\frac{(y - \mu)^2}{2 \sigma^2} \right)  &amp;amp;&amp;amp;
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Now, we have shown (somewhat intuitively and symbolically, via derivative form arithmetic) that such a linear transformation of the random variable $\mathbf{X}$ still follows a Gaussian distribution. In fact,&lt;&#x2F;p&gt;
&lt;p&gt;$$
\mathbf{Y} \sim \mathcal{N}(\mu, \sigma^2)
$$&lt;&#x2F;p&gt;
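A quick numerical check of this conclusion, assuming NumPy (the particular $\mu$ and $\sigma$ below are arbitrary):

```python
import numpy as np

mu, sigma = 1.5, 0.8
rng = np.random.default_rng(0)

# Push samples of X ~ N(0, 1) through f(x) = sigma * x + mu ...
y = sigma * rng.standard_normal(1_000_000) + mu

# ... and compare against the density from the change-of-variables formula:
#   q(y) = (1 / sigma) * p((y - mu) / sigma)
def q(v):
    u = (v - mu) / sigma
    return np.exp(-u ** 2 / 2) / (np.sqrt(2 * np.pi) * sigma)

# q should integrate to 1 (rectangle-rule integration over a wide grid) ...
grid = np.linspace(mu - 8 * sigma, mu + 8 * sigma, 20_001)
total = np.sum(q(grid)) * (grid[1] - grid[0])

# ... and the empirical moments of Y should match N(mu, sigma^2).
print(total, np.mean(y), np.std(y))
```

The integral is approximately 1, and the sample mean and standard deviation land on $\mu$ and $\sigma$ up to Monte Carlo noise.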
&lt;p&gt;Pretty straightforward, right?&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>VLAN Configuration by Examples</title>
        <published>2023-02-11T00:00:00+00:00</published>
        <updated>2023-02-11T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://www.breakds.org/vlan-configuration-by-examples/"/>
        <id>https://www.breakds.org/vlan-configuration-by-examples/</id>
        
        <content type="html" xml:base="https://www.breakds.org/vlan-configuration-by-examples/">&lt;h1 id=&quot;why-am-i-writing-this&quot;&gt;Why am I writing this?&lt;&#x2F;h1&gt;
&lt;p&gt;As I worked on upgrading my home network with a NixOS router, I found myself once again needing to
update the VLAN configuration on my Aruba Instant On 1930 PoE switch. However, I felt hesitant to do
so due to my previous struggles in grasping the concept of VLAN despite reading multiple online
articles.&lt;&#x2F;p&gt;
&lt;p&gt;Fortunately, my friend Hao recommended an informative &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;zhuanlan.zhihu.com&#x2F;p&#x2F;545383921&quot;&gt;post on the topic&lt;&#x2F;a&gt; which, combined with an hour of experimentation, finally
allowed me to understand VLAN sufficiently to implement my ideas. In this post, I aim to share my
newfound practical knowledge with examples, hoping to assist others who may have encountered similar
difficulties.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Disclaimer&lt;&#x2F;strong&gt;: Not being a network engineer, my understanding and explanation of VLANs are based on a simplified mental model. While I believe that this model is both easy to understand and accurate enough for practical use, it may not encompass all technical intricacies and complexities of the concept.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;example-1-an-unmanaged-switch&quot;&gt;Example 1: An Unmanaged Switch&lt;&#x2F;h1&gt;
&lt;p&gt;A switch, specifically a Layer 2 (L2) switch, is a networking device with several physical ports,
each typically featuring an RJ45 or SFF Ethernet interface. Every port is capable of connecting to a
single device and the switch operates on L2 using MAC addresses.&lt;&#x2F;p&gt;
&lt;pre class=&quot;z-code&quot;&gt;&lt;code&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                          Ports
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            1   2   3   4   5   6   7   8       
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+   
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          |   |   |   |   |   |   |   |   |   
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +-|-+-|-+-|-+-|-+---+---+---+---+   
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            A   B   C   D
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;For example, if you connect devices &lt;code&gt;A&lt;&#x2F;code&gt;, &lt;code&gt;B&lt;&#x2F;code&gt;, &lt;code&gt;C&lt;&#x2F;code&gt;, and &lt;code&gt;D&lt;&#x2F;code&gt; to ports 1, 2, 3, and
4 of an unmanaged switch (see above), the devices will be interconnected. Device
&lt;code&gt;A&lt;&#x2F;code&gt; and &lt;code&gt;B&lt;&#x2F;code&gt; can send packets to each other, as if they were directly connected with
an Ethernet cable.&lt;&#x2F;p&gt;
&lt;p&gt;When &lt;code&gt;A&lt;&#x2F;code&gt; sends a packet to port 1, it &lt;strong&gt;enters&lt;&#x2F;strong&gt; the switch, and when the packet
reaches &lt;code&gt;B&lt;&#x2F;code&gt; via port 2, it &lt;strong&gt;leaves&lt;&#x2F;strong&gt; the switch. In this post, the statement
&quot;&lt;code&gt;A&lt;&#x2F;code&gt; and &lt;code&gt;B&lt;&#x2F;code&gt; can send packets to each other&quot; means that the packet is not
dropped by the &lt;strong&gt;VLAN rules&lt;&#x2F;strong&gt; upon entering the switch via port 1 or upon leaving
the switch via port 2.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;tagged-packet-and-untagged-packet&quot;&gt;Tagged Packet and Untagged Packet&lt;&#x2F;h2&gt;
&lt;p&gt;A packet can be tagged with a VLAN ID, which is just an integer. A packet that
has a VLAN ID is called a &quot;tagged packet&quot;, and a packet that does not have a
VLAN ID is called an &quot;untagged packet&quot;.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;example-2-managed-switch-with-single-vlan&quot;&gt;Example 2: Managed Switch with Single VLAN&lt;&#x2F;h1&gt;
&lt;p&gt;In this example, let&#x27;s assume there is only one VLAN ID &lt;code&gt;10&lt;&#x2F;code&gt;, and a packet can
either be tagged with &lt;code&gt;VLAN 10&lt;&#x2F;code&gt; or untagged.&lt;&#x2F;p&gt;
&lt;p&gt;In a managed switch, each physical port can be configured as &quot;Tagged&quot; (T),
&quot;Untagged&quot; (U), or &quot;Not Participating&quot; (blank) with respect to &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;. By
changing the interconnectivity of the ports for a specific VLAN, we can form a
virtual switch for that VLAN. Consider the example below:&lt;&#x2F;p&gt;
&lt;pre class=&quot;z-code&quot;&gt;&lt;code&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                          Ports
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            1   2   3   4   5   6   7   8       
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+   
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 10   | T | U | U |   |   |   |   |   |   
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +-|-+-|-+-|-+-|-+---+---+---+---+   
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            A   B   C   D
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Here, ports 1, 2, and 3 form a virtual switch for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;, with port 1 being
&quot;Tagged&quot; for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;, and ports 2 and 3 being &quot;Untagged&quot; for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;. We can
temporarily ignore the other ports, as they are not participating in the virtual
switch for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;The rules for the virtual switch are straightforward. We only need to consider
the behavior of the packet when it enters and leaves the virtual switch.&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;If a physical port is &quot;Tagged&quot; for VLAN 10:
&lt;ul&gt;
&lt;li&gt;It only allows packets tagged with VLAN 10 to enter.&lt;&#x2F;li&gt;
&lt;li&gt;When a packet leaves this port, it will be tagged with VLAN 10.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;If a physical port is &quot;Untagged&quot; for VLAN 10:
&lt;ul&gt;
&lt;li&gt;It only allows untagged packets to enter.&lt;&#x2F;li&gt;
&lt;li&gt;When a packet leaves this port, it will be untagged, regardless of whether
it was tagged with VLAN 10 before.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;With these rules in mind, we can understand the behavior of the packet in
different scenarios. For example:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;If device &lt;code&gt;A&lt;&#x2F;code&gt; sends an untagged packet to port 1, it will be dropped because
port 1 is a tagged port for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt; and only accepts packets tagged with
&lt;code&gt;VLAN 10&lt;&#x2F;code&gt;.&lt;&#x2F;li&gt;
&lt;li&gt;If device &lt;code&gt;A&lt;&#x2F;code&gt; sends a packet tagged with &lt;code&gt;VLAN 10&lt;&#x2F;code&gt; to port 1, the packet will
enter the switch and reach devices &lt;code&gt;B&lt;&#x2F;code&gt; and &lt;code&gt;C&lt;&#x2F;code&gt; via ports 2 and 3 as untagged
packets. The tag &lt;code&gt;VLAN 10&lt;&#x2F;code&gt; will be stripped when the packet leaves ports 2
and 3 because they are untagged ports.&lt;&#x2F;li&gt;
&lt;li&gt;If device &lt;code&gt;B&lt;&#x2F;code&gt; sends an untagged packet to port 2, it will be accepted and
delivered to device &lt;code&gt;A&lt;&#x2F;code&gt; as a packet tagged with &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;, and to device &lt;code&gt;C&lt;&#x2F;code&gt;
as an untagged packet.&lt;&#x2F;li&gt;
&lt;li&gt;If device &lt;code&gt;C&lt;&#x2F;code&gt; sends a packet tagged with &lt;code&gt;VLAN 10&lt;&#x2F;code&gt; to port 3, it will be
dropped because port 3 only accepts untagged packets.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
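The four scenarios above can be sketched as a tiny simulation. The port table and the way I represent packets are my own simplification for illustration, not a model of real switch firmware:

```python
# Port modes for the VLAN 10 virtual switch in Example 2.
PORTS = {1: "T", 2: "U", 3: "U"}   # port number to mode; other ports do not participate

def ingress(port, tag):
    """Return True if a packet with VLAN tag `tag` (None = untagged) may enter."""
    mode = PORTS.get(port)
    if mode == "T":
        return tag == 10        # tagged port: only VLAN 10 packets enter
    if mode == "U":
        return tag is None      # untagged port: only untagged packets enter
    return False                # not participating in this VLAN

def egress(port):
    """Return the tag a packet carries when it leaves `port`."""
    return 10 if PORTS.get(port) == "T" else None

# Scenario 1: untagged packet into port 1 is dropped.
print(ingress(1, None))                      # False
# Scenario 2: VLAN-10 packet into port 1 reaches ports 2 and 3 untagged.
print(ingress(1, 10), egress(2), egress(3))  # True None None
# Scenario 3: untagged packet into port 2 leaves port 1 tagged with VLAN 10.
print(ingress(2, None), egress(1))           # True 10
# Scenario 4: VLAN-10 packet into port 3 is dropped.
print(ingress(3, 10))                        # False
```

The two rules (what may enter, what tag it carries on the way out) are enough to reproduce all four scenarios.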
&lt;h1 id=&quot;example-3-managed-switch-with-2-vlans&quot;&gt;Example 3: Managed Switch with 2 VLANs&lt;&#x2F;h1&gt;
&lt;p&gt;Things become more interesting when there are multiple VLANs. This is also the
reason why people create VLANs: to form many virtual (logical) switches out of a
single physical switch device. The seemingly complicated rules are not there
just to drop packets. They exist to give the devices options to choose which
virtual switch they want a packet to be sent to.&lt;&#x2F;p&gt;
&lt;p&gt;In this example, let&#x27;s consider a switch that is configured to form two virtual
switches, one for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt; and one for &lt;code&gt;VLAN 20&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;pre class=&quot;z-code&quot;&gt;&lt;code&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                          Ports
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            1   2   3   4   5   6   7   8
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 10   | T | U | U |   |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 20   |   |   | T | U |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          |   |   |   |   |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +-|-+-|-+-|-+-|-+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            A   B   C   D
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;We can treat the switch as two separate virtual switches: the virtual switch for
&lt;code&gt;VLAN 10&lt;&#x2F;code&gt; consists of ports 1, 2, and 3, and the virtual switch for &lt;code&gt;VLAN 20&lt;&#x2F;code&gt;
consists of ports 3 and 4. Devices &lt;code&gt;A&lt;&#x2F;code&gt; and &lt;code&gt;B&lt;&#x2F;code&gt; are both only connected to the
virtual switch for VLAN 10, and can only communicate with each other and with
other devices on that same virtual switch. Device &lt;code&gt;D&lt;&#x2F;code&gt; is only connected to the
virtual switch for &lt;code&gt;VLAN 20&lt;&#x2F;code&gt;, and can only communicate with other devices on
that virtual switch. Device &lt;code&gt;C&lt;&#x2F;code&gt; is connected to both virtual switches through a
shared physical port, port 3.&lt;&#x2F;p&gt;
&lt;p&gt;When each device sends packets, it can now decide which virtual switch to send
them to by tagging the packets accordingly. Once a packet enters a virtual
switch, the rules that control how it leaves the switch &lt;strong&gt;remain the same&lt;&#x2F;strong&gt;.&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Device &lt;code&gt;A&lt;&#x2F;code&gt; can only send packets to the virtual switch for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;, since
that is the only virtual switch that port 1 participates in. In order to send
packets to the virtual switch for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;, device &lt;code&gt;A&lt;&#x2F;code&gt; must tag the packets
with &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;, since port 1 is a &quot;tagged&quot; port for that virtual switch.&lt;&#x2F;li&gt;
&lt;li&gt;Similarly, device &lt;code&gt;B&lt;&#x2F;code&gt; can only send packets to the virtual switch for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;, but since port 2 is an &quot;untagged&quot; port for that virtual switch, device
&lt;code&gt;B&lt;&#x2F;code&gt; must send untagged packets to that virtual switch. Any other type of
packet sent from device &lt;code&gt;B&lt;&#x2F;code&gt; will be dropped.&lt;&#x2F;li&gt;
&lt;li&gt;Similarly, device &lt;code&gt;D&lt;&#x2F;code&gt; can only send packets to the virtual switch for &lt;code&gt;VLAN 20&lt;&#x2F;code&gt;, and the packets must be untagged to avoid being dropped.&lt;&#x2F;li&gt;
&lt;li&gt;Device &lt;code&gt;C&lt;&#x2F;code&gt; can choose to send packets to either virtual switch by tagging the
packets appropriately. Specifically, it can send untagged packets to the
virtual switch for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;, or packets tagged with &lt;code&gt;VLAN 20&lt;&#x2F;code&gt; to the virtual
switch for &lt;code&gt;VLAN 20&lt;&#x2F;code&gt;. Any other types of packets sent from device &lt;code&gt;C&lt;&#x2F;code&gt; will be
dropped.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
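&lt;p&gt;These sending rules can be captured in a short sketch. The &lt;code&gt;CONFIG&lt;&#x2F;code&gt; table and &lt;code&gt;classify&lt;&#x2F;code&gt; helper below are hypothetical, written only to mirror this example&#x27;s diagram:&lt;&#x2F;p&gt;

```python
# Illustrative model of the two-VLAN switch above (not a real switch API).
# For each VLAN, map member ports to their role: "T" (tagged) or "U" (untagged).
CONFIG = {
    10: {1: "T", 2: "U", 3: "U"},
    20: {3: "T", 4: "U"},
}

def classify(port, tag):
    """Return the VLAN a packet enters, or None if it is dropped.
    tag is the packet's VLAN tag, or None for an untagged packet."""
    for vlan, members in CONFIG.items():
        role = members.get(port)
        if role == "T" and tag == vlan:
            return vlan          # a matching tag enters via a tagged membership
        if role == "U" and tag is None:
            return vlan          # an untagged packet enters via an untagged membership
    return None                  # no membership accepts the packet: dropped

# Device C (port 3) chooses the virtual switch by tagging:
assert classify(3, None) == 10   # untagged -> VLAN 10 (port 3 is "U" there)
assert classify(3, 20) == 20     # tagged 20 -> VLAN 20 (port 3 is "T" there)
assert classify(1, None) is None # device A's untagged packet is dropped
```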
&lt;h1 id=&quot;example-4-sharing-two-physical-ports&quot;&gt;Example 4: Sharing Two Physical Ports&lt;&#x2F;h1&gt;
&lt;p&gt;Now, let&#x27;s put what we&#x27;ve learned into practice! Consider the following slightly
different example below:&lt;&#x2F;p&gt;
&lt;pre class=&quot;z-code&quot;&gt;&lt;code&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                          Ports
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            1   2   3   4   5   6   7   8
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 10   | T | U | U | T |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 20   |   |   | T | U |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          |   |   |   |   |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +-|-+-|-+-|-+-|-+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            A   B   C   D
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;&lt;strong&gt;Question&lt;&#x2F;strong&gt;: If device &lt;code&gt;C&lt;&#x2F;code&gt; wants to send a packet to device &lt;code&gt;D&lt;&#x2F;code&gt;, what can it do?&lt;&#x2F;p&gt;
&lt;p&gt;Device &lt;code&gt;C&lt;&#x2F;code&gt; is connected to port 3 and device &lt;code&gt;D&lt;&#x2F;code&gt; is connected to port 4. Both
ports 3 and 4 participate in both virtual switches. This means that device &lt;code&gt;C&lt;&#x2F;code&gt;
has two choices:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Device &lt;code&gt;C&lt;&#x2F;code&gt; can send an untagged packet to device &lt;code&gt;D&lt;&#x2F;code&gt; via the virtual switch
for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;. This is possible because port 4 is also a member of the
virtual switch for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;. However, device &lt;code&gt;D&lt;&#x2F;code&gt; will actually receive the
packet tagged with &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;, because port 4 is &quot;tagged&quot; for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt;.&lt;&#x2F;li&gt;
&lt;li&gt;Alternatively, device &lt;code&gt;C&lt;&#x2F;code&gt; can send a packet tagged with &lt;code&gt;VLAN 20&lt;&#x2F;code&gt; to device
&lt;code&gt;D&lt;&#x2F;code&gt; via the virtual switch for &lt;code&gt;VLAN 20&lt;&#x2F;code&gt;. However, device &lt;code&gt;D&lt;&#x2F;code&gt; will actually
receive the packet untagged, because port 4 is &quot;untagged&quot; for VLAN 20.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;In this way, device &lt;code&gt;C&lt;&#x2F;code&gt; has the flexibility to decide not only which virtual
switch to use but also how the packet should be tagged upon reaching device &lt;code&gt;D&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
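&lt;p&gt;Device &lt;code&gt;C&lt;&#x2F;code&gt;&#x27;s two options can be checked with a small sketch; the &lt;code&gt;deliver&lt;&#x2F;code&gt; helper is hypothetical and only models this example&#x27;s configuration:&lt;&#x2F;p&gt;

```python
# Illustrative sketch of device C's two delivery options in Example 4
# (a hypothetical helper, not a real switch API).
CONFIG = {
    10: {1: "T", 2: "U", 3: "U", 4: "T"},
    20: {3: "T", 4: "U"},
}

def deliver(src_port, tag, dst_port):
    """Return ("delivered", received_tag) if the packet reaches dst_port,
    or ("dropped", None) if no virtual switch carries it there."""
    for vlan, members in CONFIG.items():
        role = members.get(src_port)
        accepted = (role == "T" and tag == vlan) or (role == "U" and tag is None)
        if accepted and dst_port in members:
            # the egress tag depends on dst_port's role in this VLAN
            received = vlan if members[dst_port] == "T" else None
            return ("delivered", received)
    return ("dropped", None)

# Option 1: untagged from C enters VLAN 10; D receives it tagged with 10.
assert deliver(3, None, 4) == ("delivered", 10)
# Option 2: tagged 20 from C enters VLAN 20; D receives it untagged.
assert deliver(3, 20, 4) == ("delivered", None)
```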
&lt;h1 id=&quot;one-extra-rule-each-physical-port-can-only-be-untagged-once&quot;&gt;One Extra Rule: Each Physical Port Can Only Be &quot;Untagged&quot; Once&lt;&#x2F;h1&gt;
&lt;p&gt;The following configuration is &lt;strong&gt;invalid&lt;&#x2F;strong&gt; as port 3 is untagged for both &lt;code&gt;VLAN 10&lt;&#x2F;code&gt; and &lt;code&gt;VLAN 20&lt;&#x2F;code&gt;. Why?&lt;&#x2F;p&gt;
&lt;pre class=&quot;z-code&quot;&gt;&lt;code&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                          Ports
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            1   2   3   4   5   6   7   8
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 10   | T | U | U |   |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 20   |   |   | U | U |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          |   |   |   |   |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +-|-+-|-+-|-+-|-+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            A   B   C   D
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;This is because packets are not allowed to be duplicated and sent to multiple
virtual switches. Consider the case when device &lt;code&gt;C&lt;&#x2F;code&gt; sends an untagged packet to
port 3: it is &lt;strong&gt;undecidable&lt;&#x2F;strong&gt; whether it should go into the switch for &lt;code&gt;VLAN 10&lt;&#x2F;code&gt; or the
switch for &lt;code&gt;VLAN 20&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Therefore, a configuration where a physical port participates in multiple
virtual switches as an untagged port is not valid.&lt;&#x2F;p&gt;
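&lt;p&gt;This extra rule is easy to check mechanically. The &lt;code&gt;valid&lt;&#x2F;code&gt; helper below is a hypothetical sketch of such a check:&lt;&#x2F;p&gt;

```python
# A minimal validity check for the rule above: each physical port may be an
# untagged ("U") member of at most one VLAN. (Illustrative, not a vendor API.)
def valid(config):
    untagged_seen = set()
    for members in config.values():
        for port, role in members.items():
            if role == "U":
                if port in untagged_seen:
                    return False     # port untagged in two VLANs: ambiguous
                untagged_seen.add(port)
    return True

# The invalid configuration from the diagram above:
assert not valid({10: {1: "T", 2: "U", 3: "U"}, 20: {3: "U", 4: "U"}})
# Example 3's configuration is fine:
assert valid({10: {1: "T", 2: "U", 3: "U"}, 20: {3: "T", 4: "U"}})
```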
&lt;h1 id=&quot;final-example-designing-an-one-armed-router&quot;&gt;Final Example: Designing a One-Armed Router&lt;&#x2F;h1&gt;
&lt;p&gt;A common use case for VLANs is when you need to use a computer with a single
ethernet port as your router. Normally, a router should have at least two ports,
one for the uplink (the modem that your ISP provides) and one for the downlink
(the rest of your home devices, usually via a switch). In this example, we&#x27;ll
assume you want to connect two devices: a WiFi access point and a PC.&lt;&#x2F;p&gt;
&lt;p&gt;To do this, you will need two virtual switches spanning four ports. We can use &lt;code&gt;VLAN 10&lt;&#x2F;code&gt; to
connect the router and the uplink modem, and &lt;code&gt;VLAN 20&lt;&#x2F;code&gt; to connect the router, the
WiFi AP, and the PC.&lt;&#x2F;p&gt;
&lt;pre class=&quot;z-code&quot;&gt;&lt;code&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                          Ports
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            1   2   3   4   5   6   7   8
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 10   | U | T |   |   |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 20   |   | T | U | U |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          |   |   |   |   |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +-|-+-|-+-|-+-|-+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          Modem |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;              Router|   PC 
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                    |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                  WiFi AP
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;With this configuration, you can use the PC and WiFi AP simultaneously without
needing a multi-port router.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;bonus-example-adding-multiple-wifi-networks-with-a-single-wifi-ap&quot;&gt;Bonus Example: Adding Multiple WiFi Networks with a Single WiFi AP&lt;&#x2F;h1&gt;
&lt;p&gt;In my case, I also need to set up 3 separate WiFi networks on a single WiFi
AP: one for personal devices, one for IoT devices (i.e. smart home stuff), and
one for guests. By using VLANs to separate the personal, IoT, and guest
networks, we can ensure that devices on each network are isolated from each
other, providing an extra layer of security for our home network.&lt;&#x2F;p&gt;
&lt;p&gt;Fortunately my &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.amazon.com&#x2F;Aruba-Instant-Indoor-Access-Point&#x2F;dp&#x2F;B07V3J5TXJ&quot;&gt;WiFi
AP&lt;&#x2F;a&gt;
supports VLAN tagging, so I can create &lt;code&gt;VLAN 30&lt;&#x2F;code&gt; and &lt;code&gt;VLAN 40&lt;&#x2F;code&gt; for the IoT
devices and the guest network. This also means adding two more virtual switches to
connect the router and the WiFi AP.&lt;&#x2F;p&gt;
&lt;p&gt;A revised diagram is shown below.&lt;&#x2F;p&gt;
&lt;pre class=&quot;z-code&quot;&gt;&lt;code&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                          Ports
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;            1   2   3   4   5   6   7   8
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 10   | U | T |   |   |   |   |   |   |   (Uplink)
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 20   |   | T | U | U |   |   |   |   |   (Personal Network)
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 30   |   | T | T |   |   |   |   |   |   (IoT Network)
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;VLAN 40   |   | T | T |   |   |   |   |   |   (Guest Network)
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +---+---+---+---+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          |   |   |   |   |   |   |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          +-|-+-|-+-|-+-|-+---+---+---+---+
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;          Modem |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                |   |   |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;              Router|   PC 
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                    |
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;                  WiFi AP
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Thank you for reading, and I hope this post helps you!&lt;&#x2F;p&gt;
&lt;h1 id=&quot;acknowledgement&quot;&gt;Acknowledgement&lt;&#x2F;h1&gt;
&lt;p&gt;Special thanks to &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;chat.openai.com&#x2F;chat&quot;&gt;ChatGPT&lt;&#x2F;a&gt;, an AI language model
trained by OpenAI, for helping me revise and improve this post.&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Brief Notes on Tech Leading</title>
        <published>2021-07-05T00:00:00+00:00</published>
        <updated>2021-07-05T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://www.breakds.org/notes-on-management/"/>
        <id>https://www.breakds.org/notes-on-management/</id>
        
<content type="html" xml:base="https://www.breakds.org/notes-on-management/">&lt;p&gt;For quite some time, I have been leading software engineering teams. This is not a piece of advice for current or prospective tech leads; in fact, I believe that many of you are better at managing a group than I am. Nonetheless, I want to convey what I have learned from this incredible journey, in the hope of inspiring some readers.&lt;&#x2F;p&gt;
&lt;p&gt;There are no rules you can follow to reach the global optimum of software engineering management, just as there are no rules for many other kinds of art in the world. Your strategy will almost certainly be determined by the situation, the team, the individuals, the objectives, and sometimes even your own style as the leader.&lt;&#x2F;p&gt;
&lt;p&gt;However, there are certain areas of focus that I attempt to improve, and I have found that they have helped me construct a more successful and cohesive team. In general, I prefer to establish a team that can accomplish ambitious long-term goals rather than a team that can accomplish simple tasks quickly.&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Engineers are most productive when they are inspired rather than forced. Do not simply ask them to do exactly what you want them to do.&lt;&#x2F;li&gt;
&lt;li&gt;Create an environment where the team is allowed to take risks, express their thoughts, and try technically demanding solutions freely. Mistakes help people grow; punishment does not.&lt;&#x2F;li&gt;
&lt;li&gt;The majority of so-called metrics are not well-defined, and an ill-defined metric is far worse than having none at all.&lt;&#x2F;li&gt;
&lt;li&gt;In terms of improving development performance, DevOps and tooling are more significant than you may realize.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Or, to summarize it all another way: try not to be a lazy manager.&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Minimalist&#x27;s Kalman Filter Derivation, Part II</title>
        <published>2020-09-25T00:00:00+00:00</published>
        <updated>2020-09-25T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://www.breakds.org/kalman-filter-part-2/"/>
        <id>https://www.breakds.org/kalman-filter-part-2/</id>
        
        <content type="html" xml:base="https://www.breakds.org/kalman-filter-part-2/">&lt;h2 id=&quot;background&quot;&gt;Background&lt;&#x2F;h2&gt;
&lt;p&gt;As promised, in this post we will derive the multi-variate
version of the Kalman filter. It will be a bit more math-intensive because
we are focusing on &lt;strong&gt;derivation&lt;&#x2F;strong&gt;, but as in the previous post I
will try my best to keep the equations intuitive and easy to
understand.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;bayes-filter&quot;&gt;Bayes Filter&lt;&#x2F;h2&gt;
&lt;p&gt;The Kalman filter is actually a special form of Bayes filter. This means
that the Bayes filter solves a (slightly) more general
problem. We will first give a high-level overview of the Bayes filter and
then add constraints to turn it into the Kalman filter problem.&lt;&#x2F;p&gt;
&lt;p&gt;We take this route because&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Although more general, the Bayes filter is actually more
straightforward to derive.&lt;&#x2F;li&gt;
&lt;li&gt;Understanding the connection between the Kalman filter and the Bayes
filter gives a much better picture of the great ideas
behind both of them.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;Unlike the previous post, we are looking at a system with a
multi-variate ($n$-dimensional) state space. This means that the
system undergoes a series of states&lt;&#x2F;p&gt;
&lt;p&gt;$$
x_1, x_2, x_3, ..., x_t, x_{t+1}, ... \in \mathbb{R}^n
$$&lt;&#x2F;p&gt;
&lt;p&gt;Similarly, we are not able to directly observe the states. What we can
do is, at each timestamp $t$, take a measurement to obtain the
observations&lt;&#x2F;p&gt;
&lt;p&gt;$$
y_1, y_2, y_3, ..., y_t, y_{t+1}, ... \in \mathbb{R}^m
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that here $m$ is not necessarily equal to $n$. Bayes filter aims
to solve the problem of estimating (the distribution of) $x_t$ given
the observed trajectory $y_{1..t}$, i.e. estimating the probability
density function (pdf):&lt;&#x2F;p&gt;
&lt;p&gt;$$
p\left(x_t \mid y_{1..t}\right) = ?
$$&lt;&#x2F;p&gt;
&lt;h3 id=&quot;solve-bayes-filter-problem&quot;&gt;Solving the Bayes Filter Problem&lt;&#x2F;h3&gt;
&lt;p&gt;Bayes filter assumes that you know&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
p\left( x_t \mid x_{t-1} \right) &amp;amp; \textrm{the transition model}\\
p\left( y_t \mid x_t \right) &amp;amp; \textrm{the measurement model} \\
p \left( x_{t-1} \mid y_{1..t-1} \right) &amp;amp; \textrm{the previous state estimation}
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;The first step is to obtain the estimation of $x_t$ purely based on
prediction (i.e. without the newest observation $y_t$). By applying
the &lt;strong&gt;transition model&lt;&#x2F;strong&gt; and the previous state estimation, we have:&lt;&#x2F;p&gt;
&lt;p&gt;$$
p\left( x_t \mid y_{1..t-1} \right) = \int_{x} p\left(x_t \mid x_{t-1} = x\right) \cdot p\left(x_{t-1} = x \mid y_{1..t-1} \right) \mathrm{d}x
$$&lt;&#x2F;p&gt;
&lt;p&gt;We then look at the &lt;strong&gt;posterior&lt;&#x2F;strong&gt;&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#1&quot;&gt;1&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
p \left( x_t \mid y_{1..t} \right) &amp;amp;=&amp;amp; \frac{p\left( x_t, y_t \mid y_{1..t-1}\right)}{p\left( y_t \mid y_{1..t-1}\right)} \\
&amp;amp;\propto&amp;amp; p\left( x_t, y_t \mid y_{1..t-1}\right) \\
&amp;amp;=&amp;amp; p \left(y_t \mid x_t \right) \cdot p\left( x_t \mid y_{1..t-1} \right)
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that both terms on the RHS are known as they are just the
&lt;strong&gt;measurement model&lt;&#x2F;strong&gt; and the pure prediction estimation.&lt;&#x2F;p&gt;
&lt;div class=&quot;footnote-definition&quot; id=&quot;1&quot;&gt;&lt;sup class=&quot;footnote-definition-label&quot;&gt;1&lt;&#x2F;sup&gt;
&lt;p&gt;If you recognize it - yes we are applying Bayes inference here.&lt;&#x2F;p&gt;
&lt;&#x2F;div&gt;
&lt;p&gt;By obtaining the pdf $p\left( x_t \mid y_{1..t} \right)$ we have derived
the estimated distribution of the current state at $t$. These two
steps are the entire Bayes filter. Yes, it is just that simple
and straightforward. $\blacksquare$&lt;&#x2F;p&gt;
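&lt;p&gt;The two steps above can be sketched for a discrete (finite) state space, where the integral becomes a sum. The transition and likelihood numbers below are made up purely for illustration:&lt;&#x2F;p&gt;

```python
def bayes_step(prior, transition, likelihood):
    """One Bayes filter step on a finite state space.
    prior[i]         = p(x_{t-1} = i | y_{1..t-1})
    transition[i][j] = p(x_t = j | x_{t-1} = i)
    likelihood[j]    = p(y_t | x_t = j)
    Returns the posterior p(x_t | y_{1..t})."""
    n = len(prior)
    # Step 1: pure prediction; the integral becomes a finite sum.
    predicted = [sum(prior[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Step 2: the posterior is proportional to measurement model times prediction.
    unnormalized = [likelihood[j] * predicted[j] for j in range(n)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]

# Toy 2-state example (numbers are made up):
posterior = bayes_step(
    prior=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.2, 0.8]],
    likelihood=[0.7, 0.1],
)
# posterior is normalized and favors state 0, which the measurement supports.
```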
&lt;h3 id=&quot;kalman-filter-is-a-special-bayes&quot;&gt;Kalman Filter Is a Special Bayes Filter&lt;&#x2F;h3&gt;
&lt;p&gt;We say that the Kalman filter is a special form of Bayes filter because it
imposes 3 constraints on the Bayes filter, one for each of the known
conditions:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;The transition model is assumed to be &lt;strong&gt;linear&lt;&#x2F;strong&gt; with &lt;strong&gt;Gaussian&lt;&#x2F;strong&gt;
error. This means that&lt;&#x2F;p&gt;
&lt;p&gt;$$
p \left( x_t \mid x_{t-1} \right) = \textrm{  pdf of } N( F_tx_{t-1}, Q_t)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Here $F_t$ is an $n \times n$ matrix, and $Q_t$ is an $n \times n$
&lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Covariance_matrix&quot;&gt;covariance
matrix&lt;&#x2F;a&gt; describing
the error. The most important properties of the covariance matrices
are that they are&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;always symmetric&lt;&#x2F;li&gt;
&lt;li&gt;always &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Definite_symmetric_matrix&quot;&gt;positive semi-definite&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;li&gt;invertible whenever they are strictly positive definite, which the Kalman filter assumes&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;The measurement model is also assumed &lt;strong&gt;linear&lt;&#x2F;strong&gt; with &lt;strong&gt;Gaussian&lt;&#x2F;strong&gt;
error.&lt;&#x2F;p&gt;
&lt;p&gt;$$
p\left( y_t \mid x_t \right) = \textrm{ pdf of } N(H_tx_t, R_t)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Similarly, $H_t$ is an $m \times n$ matrix and $R_t$ is an $m \times
m$ covariance matrix.&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;The estimated distribution of $x_t \mid y_{1..t}$ is assumed
Gaussian, i.e.&lt;&#x2F;p&gt;
&lt;p&gt;$$
p\left( x_t \mid y_{1..t} \right) = \textrm{ pdf of } N(\hat{x}_t, P_t)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Here in the above formula&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;$\hat{x}_t$ is the estimated mean of the state at $t$.&lt;&#x2F;li&gt;
&lt;li&gt;$P_t$ is an $n \times n$ matrix representing the estimated
covariance of the state at $t$.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;We can now follow Bayes Filter&#x27;s 2 steps to solve Kalman Filter.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;pure-prediction-in-kalman-filter&quot;&gt;Pure Prediction in Kalman Filter&lt;&#x2F;h2&gt;
&lt;p&gt;As shown above the first step is about computing&lt;&#x2F;p&gt;
&lt;p&gt;$$
p\left( x_t \mid y_{1..t-1} \right) = \int_{x} p\left(x_t \mid x_{t-1} = x\right) \cdot p\left(x_{t-1} = x \mid y_{1..t-1} \right) \mathrm{d}x
$$&lt;&#x2F;p&gt;
&lt;p&gt;We can continue to simplify it, since we are dealing with the Kalman filter
and we know that both pdfs involved on the RHS are &lt;strong&gt;Gaussian&lt;&#x2F;strong&gt;. Note
that if we take the &lt;strong&gt;generative model&lt;&#x2F;strong&gt; view of the above equation,
it actually says:&lt;&#x2F;p&gt;
&lt;blockquote&gt;
&lt;p&gt;The random variable $x_t|y_{1..t-1}$ is generated by&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Sample $x_{t-1} \mid y_{1..t-1}$ from $N(\hat{x}_{t-1}, P_{t-1})$&lt;&#x2F;li&gt;
&lt;li&gt;Sample $e_t$ from $N(0, Q_t)$&lt;&#x2F;li&gt;
&lt;li&gt;Obtain $x_{t} \mid y_{1..t-1} = F_t \cdot (x_{t-1} \mid y_{1..t-1}) + e_t$&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;For ease of reading we are going to use $x_t$ as short for the
conditional variable $x_t \mid y_{1..t-1}$ and $x_{t-1}$ as short for
the conditional random variable $x_{t-1} \mid y_{1..t-1}$.&lt;&#x2F;p&gt;
&lt;p&gt;Recall that the moment generating function for a Gaussian distribution
$N(\mu, \Sigma)$ is&lt;&#x2F;p&gt;
&lt;p&gt;$$
g(k) = \mathbb{E} \left[ e^{k^\intercal x}\right] = \exp \left[ k^\intercal\mu + \frac{1}{2} k^\intercal \Sigma k \right]
$$&lt;&#x2F;p&gt;
&lt;p&gt;By applying this we can try to obtain the moment generating function
for the random variable $x_t \mid y_{1..t-1}$ by the following:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
g(k) &amp;amp;=&amp;amp; \mathbb{E} \left[ e^{k^\intercal x_t}\right] = \mathbb{E} \left[ e^{k^\intercal (F_t x_{t-1} + e_t)} \right] \\
&amp;amp;=&amp;amp; \mathbb{E} \left[ e^{k^\intercal F_t x_{t-1}} \right] \cdot \mathbb{E} \left[ e^{k^\intercal e_t} \right] \\
&amp;amp;=&amp;amp; \mathbb{E} \left[ e^{(F_{t}^{\intercal} k )^\intercal x_{t-1}} \right] \cdot \mathbb{E} \left[ e^{k^\intercal e_t} \right] \\
&amp;amp;=&amp;amp; \exp \left[ (F_{t}^{\intercal}k)^\intercal \hat{x}_{t-1} + \frac{1}{2} (F_{t}^{\intercal}k)^\intercal P_{t-1} (F_{t}^{\intercal}k) \right] \cdot \exp \left[ \frac{1}{2} k^\intercal Q_t k\right] \\
&amp;amp;=&amp;amp; \exp \left[ k^\intercal F_t \hat{x}_{t-1} + \frac{1}{2}k^\intercal \left( F_{t} P_{t-1} F_{t}^{\intercal} + Q_t \right)k \right]
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Now it becomes super clear that $x_t \mid y_{1..t-1}$ also follows a
Gaussian distribution. In fact&lt;&#x2F;p&gt;
&lt;p&gt;$$
p\left(x_t \mid y_{1..t-1}\right) = \textrm{ pdf of } N(F_t\hat{x}_{t-1}, F_t P_{t-1} F_t^{\intercal} + Q_t)
$$&lt;&#x2F;p&gt;
&lt;p&gt;The mean and covariance determine the pure-prediction estimation.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
x_{t}&#x27; &amp;amp;=&amp;amp; F_t \hat{x}_{t-1} &amp;amp; \textrm{pure prediction mean} \\
P_{t}&#x27; &amp;amp;=&amp;amp; F_t P_{t-1} F_t^\intercal + Q_t &amp;amp; \textrm{pure prediction covariance}
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Remember these two quantities; we will use them in the next section.&lt;&#x2F;p&gt;
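&lt;p&gt;A quick numerical sketch of the prediction step $x&#x27; = F\hat{x}$, $P&#x27; = FPF^\intercal + Q$ (the matrices below are made-up toy values):&lt;&#x2F;p&gt;

```python
# Numerical sketch of the pure-prediction step with made-up matrices.
import numpy as np

F = np.array([[1.0, 1.0], [0.0, 1.0]])   # toy constant-velocity transition
Q = np.array([[0.1, 0.0], [0.0, 0.1]])   # process noise covariance
x_hat = np.array([2.0, 0.5])             # previous state estimate
P = np.array([[0.5, 0.1], [0.1, 0.3]])   # previous estimate covariance

x_pred = F @ x_hat                       # pure prediction mean
P_pred = F @ P @ F.T + Q                 # pure prediction covariance

# The predicted covariance stays symmetric, as a covariance must.
assert np.allclose(P_pred, P_pred.T)
```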
&lt;h2 id=&quot;posterior-in-kalman-filter&quot;&gt;Posterior in Kalman Filter&lt;&#x2F;h2&gt;
&lt;p&gt;The second step in Bayes filter is just to compute the actual
estimation called &lt;strong&gt;posterior&lt;&#x2F;strong&gt; with&lt;&#x2F;p&gt;
&lt;p&gt;$$
p \left( x_t \mid y_{1..t} \right) \propto p \left(y_t \mid x_t \right) \cdot p\left( x_t \mid y_{1..t-1} \right)
$$&lt;&#x2F;p&gt;
&lt;p&gt;You might think it is now straightforward to simplify this
equation, since we know both terms on the RHS are Gaussian pdfs whose
parameters are known. While it is true that we can directly compute
the product of the two Gaussian pdfs, that approach involves some complicated
matrix inversions that we would have to deal with. To
avoid such complexity, we choose to estimate an &lt;strong&gt;auxiliary&lt;&#x2F;strong&gt; random
variable&lt;&#x2F;p&gt;
&lt;p&gt;$$
Y = H_tx_t
$$&lt;&#x2F;p&gt;
&lt;p&gt;first. Note that $H_t$ is just a known constant matrix, and $Y$ is
basically a linear transformation of $x_t$. $Y$ has a physical
meaning as well: it is the observation value we would obtain if there were no
noise in the measurement. This also points out an important property
of $Y$:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
r_t &amp;amp;=&amp;amp; y_t - Y &amp;amp; \textrm{the measurement noise random variable} \\
r_t &amp;amp;\sim&amp;amp; N(0, R_t) &amp;amp; \textrm{as we assume Gaussian error}
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;h3 id=&quot;relationship-between-y-and-x-t&quot;&gt;Relationship between $Y$ and $x_t$&lt;&#x2F;h3&gt;
&lt;p&gt;Since $Y$ is obtained by just applying a linear transformation on
$x_t$, it is obvious that if $x_t$ follows a Gaussian distribution,
$Y$ also does. Nonetheless we will try to derive it formally via
moment generating function.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
g_Y(k) &amp;amp;=&amp;amp; \mathbb{E}\left[ e^{k^\intercal Y}\right] = \mathbb{E}\left[ e^{k^\intercal H_t x_t}\right]
= \mathbb{E}\left[ e^{(H_t^\intercal k)^\intercal x_t}\right] \\
&amp;amp;=&amp;amp; \exp \left[ (H_t^\intercal k)^\intercal \mu_{x_t} + \frac{1}{2} (H_t^\intercal k)^\intercal \Sigma_{x_t} (H_t^\intercal k)\right] \\
&amp;amp;=&amp;amp; \exp \left[ k^\intercal (H_t\mu_{x_t}) + \frac{1}{2} k^\intercal (H_t \Sigma_{x_t} H_t^\intercal) k\right]
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;The above equation shows that $Y \sim N(H_t\mu_{x_t}, H_t \Sigma_{x_t}
H_t^\intercal)$.&lt;&#x2F;p&gt;
&lt;p&gt;This actually tells &lt;strong&gt;two stories&lt;&#x2F;strong&gt;:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;The &lt;strong&gt;pure prediction&lt;&#x2F;strong&gt; estimation of $Y$, i.e. $Y | y_{1..t-1}$ is
actually&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
Y | y_{1..t-1} &amp;amp;\sim&amp;amp; N(y&#x27;, S) \\
y&#x27; &amp;amp;=&amp;amp; H_tx_t&#x27; \\
S &amp;amp;=&amp;amp; H_tP_t&#x27;H_t^\intercal
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;The same relationship holds for the final posterior estimation of
$Y$ (i.e. $Y | y_{1..t}$) and $x_t$ (i.e. $x_t | y_{1..t}$)&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
x_t | y_{1..t} &amp;amp;\sim&amp;amp; N(\hat{x}_t, P_t) \\
Y | y_{1..t} &amp;amp;\sim&amp;amp; N(H_t\hat{x}_t, H_tP_tH_t^\intercal)
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;h3 id=&quot;derivation-of-the-posterior-of-y&quot;&gt;Derivation of the Posterior of $Y$&lt;&#x2F;h3&gt;
&lt;p&gt;Rewriting the posterior equation above so that it is w.r.t. $Y$, we have&lt;&#x2F;p&gt;
&lt;p&gt;$$
p \left( Y = y\mid y_{1..t} \right) \propto p \left(y_t \mid Y = y\right) \cdot p\left( Y = y\mid y_{1..t-1} \right)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Let us take a closer look at the RHS. The first term&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
p \left(y_t \mid Y = y \right) &amp;amp;=&amp;amp; p \left(r_t = y_t - y \right) \\
&amp;amp;=&amp;amp; \textrm{const} \cdot \exp \left[-\frac{1}{2} (y_t - y)^\intercal R_t^{-1} (y_t - y)\right] \\
&amp;amp;=&amp;amp; \textrm{const} \cdot \exp \left[-\frac{1}{2} (y - y_t)^\intercal R_t^{-1} (y - y_t)\right] \\
&amp;amp;=&amp;amp; \textrm{pdf of } N(y_t, R_t) \textrm{ w.r.t. } Y
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;And the second term is the pure prediction estimation of $Y$ which we
have already derived in the previous subsection. It is&lt;&#x2F;p&gt;
&lt;p&gt;$$
p\left( Y = y\mid y_{1..t-1} \right) = \textrm{ pdf of } N(H_tx_t&#x27;, H_tP_t&#x27;H_t^\intercal) \textrm{ w.r.t. } Y
$$&lt;&#x2F;p&gt;
&lt;p&gt;For ease of reading let&#x27;s denote both of them as&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
N(y_t, R_t)_Y &amp;amp;=&amp;amp; \textrm{pdf of } N(y_t, R_t) \textrm{ w.r.t. } Y \\
N(H_tx_t&#x27;, H_tP_t&#x27;H_t^\intercal)_Y &amp;amp;=&amp;amp; \textrm{ pdf of } N(H_tx_t&#x27;, H_tP_t&#x27;H_t^\intercal) \textrm{ w.r.t. } Y
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Back to the posterior of $Y$, we can now see&lt;&#x2F;p&gt;
&lt;p&gt;$$
p \left( Y = y\mid y_{1..t} \right) \propto N(y_t, R_t)_Y \cdot N(H_tx_t&#x27;, H_tP_t&#x27;H_t^\intercal)_Y
$$&lt;&#x2F;p&gt;
&lt;p&gt;Okay, so the posterior pdf is actually the product of two Gaussian
pdfs. We now need to apply our Lemma III (proof in the Appendix),
which says that the product of two Gaussian pdfs with parameters $\mu_1$,
$\Sigma_1$, $\mu_2$ and $\Sigma_2$ is the pdf of an &lt;strong&gt;unnormalized&lt;&#x2F;strong&gt;
Gaussian, s.t.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\Sigma &amp;amp;=&amp;amp; \Sigma_2 - K\Sigma_2 \\
\mu &amp;amp;=&amp;amp; \mu_2 + K (\mu_1 - \mu_2)
\end{cases}
\textrm{ , where } K = \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}
$$&lt;&#x2F;p&gt;
&lt;p&gt;(everyone is encouraged to read the appendix for the proofs of all the
lemmas, as they are rather simple).&lt;&#x2F;p&gt;
&lt;p&gt;Now plug in what we have here&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\Sigma_1 &amp;amp;=&amp;amp; R_t &amp;amp;\textrm{ and }&amp;amp; \mu_1 &amp;amp;=&amp;amp; y_t \\
\Sigma_2 &amp;amp;=&amp;amp;  H_tP_t&#x27;H_t^\intercal &amp;amp;\textrm{ and }&amp;amp; \mu_2 &amp;amp;=&amp;amp; H_tx_t&#x27;
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;which gives the parameters for the posterior distribution of $Y|y_{1..t}$&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
K_Y &amp;amp;=&amp;amp; H_tP_t&#x27;H_t^\intercal(R_t + H_tP_t&#x27;H_t^\intercal)^{-1} \\
\Sigma_Y &amp;amp;=&amp;amp; H_tP_t&#x27;H_t^\intercal - K_Y H_tP_t&#x27;H_t^\intercal \\
&amp;amp;=&amp;amp; H_tP_t&#x27;H_t^\intercal - H_tP_t&#x27;H_t^\intercal(R_t + H_tP_t&#x27;H_t^\intercal)^{-1} H_tP_t&#x27;H_t^\intercal \\
\mu_Y &amp;amp;=&amp;amp; H_tx_t&#x27; + K_Y (y_t - H_tx_t&#x27;) \\
&amp;amp;=&amp;amp; H_tx_t&#x27; + H_tP_t&#x27;H_t^\intercal(R_t + H_tP_t&#x27;H_t^\intercal)^{-1}(y_t - H_tx_t&#x27;)
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;h3 id=&quot;derivation-of-the-posterior-of-x-t&quot;&gt;Derivation of the Posterior of $x_t$&lt;&#x2F;h3&gt;
&lt;p&gt;Right, the above &lt;strong&gt;does&lt;&#x2F;strong&gt; look complicated. But you do not have to
remember it; we derived it intentionally as a stepping stone toward our
final goal of estimating the posterior distribution of $x_t$. We now
go back to look at the last equation of the previous subsection:&lt;&#x2F;p&gt;
&lt;p&gt;$$
Y | y_{1..t} \sim N(H_t\hat{x}_t, H_tP_tH_t^\intercal)
$$&lt;&#x2F;p&gt;
&lt;p&gt;It is straightforward to see that&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
H_tP_tH_t^\intercal &amp;amp;=&amp;amp; H_tP_t&#x27;H_t^\intercal - H_tP_t&#x27;H_t^\intercal(R_t + H_tP_t&#x27;H_t^\intercal)^{-1} H_tP_t&#x27;H_t^\intercal \\
H_t\hat{x}_t &amp;amp;=&amp;amp; H_tx_t&#x27; + H_tP_t&#x27;H_t^\intercal(R_t + H_tP_t&#x27;H_t^\intercal)^{-1}(y_t - H_tx_t&#x27;)
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that this holds for whatever $H_t$ we put there. Therefore, by
stripping $H_t$ from both sides, we have derived the parameters for the
posterior estimation of $x_t$:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
P_t &amp;amp;=&amp;amp; P_t&#x27; - P_t&#x27;H_t^\intercal(R_t + H_tP_t&#x27;H_t^\intercal)^{-1} H_tP_t&#x27; \\
\hat{x}_t &amp;amp;=&amp;amp; x_t&#x27; + P_t&#x27;H_t^\intercal(R_t + H_tP_t&#x27;H_t^\intercal)^{-1}(y_t - H_tx_t&#x27;)
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;In fact, we can further simplify it by defining $K$ (which is usually
called the &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;dsp.stackexchange.com&#x2F;questions&#x2F;2347&#x2F;how-to-understand-kalman-gain-intuitively&quot;&gt;Kalman gain&lt;&#x2F;a&gt;)&lt;&#x2F;p&gt;
&lt;p&gt;$$
K = P_t&#x27;H_t^\intercal(R_t + H_tP_t&#x27;H_t^\intercal)^{-1}
$$&lt;&#x2F;p&gt;
&lt;p&gt;and the posterior estimation of $x_t$ can then be simplified as&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
P_t &amp;amp;=&amp;amp; P_t&#x27; - KH_tP_t&#x27; \\
\hat{x}_t &amp;amp;=&amp;amp; x_t&#x27; + K(y_t - H_tx_t&#x27;)
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;That concludes the derivation of the multi-variate Kalman filter. $\blacksquare$&lt;&#x2F;p&gt;
&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;&#x2F;h2&gt;
&lt;p&gt;Based on the derivation, the Kalman filter can be used to obtain the
posterior estimation following the Bayes filter&#x27;s approach. The steps
are&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Compute the pure prediction estimation parameters&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
x_t&#x27; &amp;amp;=&amp;amp; F_t \hat{x}_{t-1} \\
P_t&#x27; &amp;amp;=&amp;amp; F_tP_{t-1}F_t^\intercal + Q_t
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;Compute the Kalman gain $K$&lt;&#x2F;p&gt;
&lt;p&gt;$$
K = P_t&#x27;H_t^\intercal(R_t + H_tP_t&#x27;H_t^\intercal)^{-1}
$$&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;Compute the posterior&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
P_t &amp;amp;=&amp;amp; P_t&#x27; - KH_tP_t&#x27; \\
\hat{x}_t &amp;amp;=&amp;amp; x_t&#x27; + K(y_t - H_tx_t&#x27;)
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
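&lt;p&gt;The three steps above can be sketched in a few lines of NumPy. This is my own illustrative sketch of the summary (the function name &lt;code&gt;kalman_step&lt;&#x2F;code&gt; and the argument order are arbitrary choices), not code from any particular library:&lt;&#x2F;p&gt;

```python
import numpy as np

def kalman_step(x_hat, P, y, F, Q, H, R):
    """One Kalman filter iteration, following steps 1-3 of the summary."""
    # Step 1: pure prediction estimation.
    x_prime = F @ x_hat
    P_prime = F @ P @ F.T + Q
    # Step 2: Kalman gain K = P' H^T (R + H P' H^T)^{-1}.
    K = P_prime @ H.T @ np.linalg.inv(R + H @ P_prime @ H.T)
    # Step 3: posterior update.
    x_new = x_prime + K @ (y - H @ x_prime)
    P_new = P_prime - K @ H @ P_prime
    return x_new, P_new
```

&lt;p&gt;Each call consumes the previous posterior $(\hat{x}_{t-1}, P_{t-1})$ and one measurement $y_t$, and returns the new posterior $(\hat{x}_t, P_t)$.&lt;&#x2F;p&gt;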
&lt;h1 id=&quot;appendix&quot;&gt;Appendix&lt;&#x2F;h1&gt;
&lt;p&gt;I want the post to be very self-contained. Therefore I prepared 3
lemmas in the appendix so that the main article won&#x27;t be too distracting.
All 3 lemmas are about the product of two Gaussian pdfs, and it is
suggested that you read them in order.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;appendix-lemma-i&quot;&gt;Appendix -  Lemma I&lt;&#x2F;h2&gt;
&lt;p&gt;The product of 2 scalar Gaussian pdfs is an &lt;strong&gt;unnormalized&lt;&#x2F;strong&gt; pdf of
another scalar Gaussian. To be more specific, if&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
f(x) &amp;amp;=&amp;amp; f_1(x)f_2(x) \textrm{, where } \\
f_1(x) &amp;amp;=&amp;amp; \textrm{pdf of } N(\mu_1, \sigma_1^2) \\
f_2(x) &amp;amp;=&amp;amp; \textrm{pdf of } N(\mu_2, \sigma_2^2)
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;then $f(x)$ is, up to a constant factor, the pdf of $N(\mu, \sigma^2)$ with&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\sigma^2 &amp;amp;=&amp;amp; \left(\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}\right)^{-1} \\
\mu &amp;amp;=&amp;amp; \frac{\sigma^2}{\sigma_1^2}\mu_1 +\frac{\sigma^2}{\sigma_2^2}\mu_2
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;proof:&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Expanding the pdfs $f_1$ and $f_2$, we have&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
f_1(x) &amp;amp;=&amp;amp; \textrm{const} \cdot \exp \left[ -\frac{(x - \mu_1)^2}{2 \sigma_1^2} \right] \\
f_2(x) &amp;amp;=&amp;amp; \textrm{const} \cdot \exp \left[ -\frac{(x - \mu_2)^2}{2 \sigma_2^2} \right] \\
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Therefore&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
f(x) &amp;amp;=&amp;amp; f_1(x) \cdot f_2(x) \\
&amp;amp;=&amp;amp; \textrm{const} \cdot \exp \left[ -\frac{(x - \mu_1)^2}{2 \sigma_1^2} -\frac{(x - \mu_2)^2}{2 \sigma_2^2} \right] \\
&amp;amp;=&amp;amp; \textrm{const} \cdot \exp -\frac{1}{2}\left[ \left(\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}\right) x^2 - 2\left(\frac{\mu_1}{\sigma_1^2} + \frac{\mu_2}{\sigma_2^2}\right)x\right]
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Therefore we know that it must be an unnormalized Gaussian form.
Assuming the parameters are $\mu$ and $\sigma^2$, we can then force&lt;&#x2F;p&gt;
&lt;p&gt;$$
\frac{(x - \mu)^2}{\sigma^2} = \left(\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}\right) x^2 - 2\left(\frac{\mu_1}{\sigma_1^2} + \frac{\mu_2}{\sigma_2^2}\right)x + \textrm{const}
$$&lt;&#x2F;p&gt;
&lt;p&gt;which gives&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\frac{1}{\sigma^2} &amp;amp;=&amp;amp; \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} \\
\frac{\mu}{\sigma^2} &amp;amp;=&amp;amp; \frac{\mu_1}{\sigma_1^2} + \frac{\mu_2}{\sigma_2^2}
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Solve it and we get&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\sigma^2 &amp;amp;=&amp;amp; \left(\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}\right)^{-1} \\
\mu &amp;amp;=&amp;amp; \frac{\sigma^2}{\sigma_1^2}\mu_1 +\frac{\sigma^2}{\sigma_2^2}\mu_2
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;This concludes the proof. $\blacksquare$&lt;&#x2F;p&gt;
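&lt;p&gt;If you want to convince yourself of Lemma I without redoing the algebra, here is a quick numeric check (all the parameter values are arbitrary): the ratio between $f_1(x)f_2(x)$ and the pdf of $N(\mu, \sigma^2)$ should be a constant that does not depend on $x$.&lt;&#x2F;p&gt;

```python
from math import sqrt, pi, exp

def gauss_pdf(x, mu, var):
    # pdf of N(mu, var) evaluated at x.
    return exp(-(x - mu) ** 2 / (2 * var)) / sqrt(2 * pi * var)

mu1, var1 = 1.0, 4.0
mu2, var2 = 3.0, 1.0

# Parameters given by Lemma I.
var = 1.0 / (1.0 / var1 + 1.0 / var2)
mu = var / var1 * mu1 + var / var2 * mu2

# The ratio f1(x) * f2(x) / pdf_of_N(mu, var)(x) should not depend on x.
xs = [-2.0, 0.0, 1.5, 4.0]
ratios = [gauss_pdf(x, mu1, var1) * gauss_pdf(x, mu2, var2)
          / gauss_pdf(x, mu, var) for x in xs]
```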
&lt;h2 id=&quot;appendix-lemma-ii&quot;&gt;Appendix -  Lemma II&lt;&#x2F;h2&gt;
&lt;p&gt;Lemma II is just the multi-variate version of Lemma I.&lt;&#x2F;p&gt;
&lt;p&gt;The product of 2 &lt;strong&gt;multi-variate&lt;&#x2F;strong&gt; Gaussian pdfs of $n$ dimensions is
an &lt;strong&gt;unnormalized&lt;&#x2F;strong&gt; pdf of another $n$-dimensional multi-variate Gaussian.
To be more specific, if&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
f(x) &amp;amp;=&amp;amp; f_1(x)f_2(x) \textrm{, where } \\
f_1(x) &amp;amp;=&amp;amp; \textrm{pdf of } N(\mu_1, \Sigma_1) \\
f_2(x) &amp;amp;=&amp;amp; \textrm{pdf of } N(\mu_2, \Sigma_2)
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;then $f(x)$ is, up to a constant factor, the pdf of $N(\mu, \Sigma)$ with&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\Sigma &amp;amp;=&amp;amp; (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1} \\
\mu &amp;amp;=&amp;amp; \Sigma \Sigma_1^{-1} \mu_1 + \Sigma \Sigma_2^{-1} \mu_2
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that Lemma II is awfully similar to Lemma I.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;proof:&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;p&gt;We again expand first, which gives&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
f_1(x) &amp;amp;=&amp;amp; \textrm{const} \cdot \exp \left[ -\frac{1}{2} (x-\mu_1)^\intercal \Sigma_1^{-1} (x - \mu_1) \right] \\
f_2(x) &amp;amp;=&amp;amp; \textrm{const} \cdot \exp \left[ -\frac{1}{2} (x-\mu_2)^\intercal \Sigma_2^{-1} (x - \mu_2) \right]
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Plug them into $f(x)$, we have&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
f(x) &amp;amp;=&amp;amp; f_1(x) \cdot f_2(x) \\
&amp;amp;=&amp;amp; \textrm{const} \cdot \exp -\frac{1}{2} \left[ x^\intercal(\Sigma_1^{-1} + \Sigma_2^{-1})x - 2x^\intercal (\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2)\right]
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;This shows that $f(x)$ is the pdf of a Gaussian. Similarly, assuming the
parameters of the Gaussian are $\mu$ and $\Sigma$, we can force&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
&amp;amp;&amp;amp;x^\intercal(\Sigma_1^{-1} + \Sigma_2^{-1})x - 2x^\intercal (\Sigma_1^{-1}\mu_1 + \Sigma_2^{-1}\mu_2)  \\
&amp;amp;=&amp;amp; (x - \mu)^\intercal \Sigma^{-1} (x - \mu) \\
&amp;amp;=&amp;amp; x^\intercal \Sigma^{-1} x - 2 x^\intercal \Sigma^{-1} \mu + \textrm{const}
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;This gives the equation&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\Sigma^{-1} &amp;amp;=&amp;amp; \Sigma_1^{-1} + \Sigma_2^{-1} \\
\Sigma^{-1}\mu &amp;amp;=&amp;amp; \Sigma_1^{-1} \mu_1 + \Sigma_2^{-1} \mu_2
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Solve it and we have&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\Sigma &amp;amp;=&amp;amp; (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1} \\
\mu &amp;amp;=&amp;amp; \Sigma \Sigma_1^{-1} \mu_1 + \Sigma \Sigma_2^{-1} \mu_2
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;This concludes the proof of Lemma II. $\blacksquare$&lt;&#x2F;p&gt;
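&lt;p&gt;Lemma II admits the same kind of numeric sanity check as Lemma I (matrices and evaluation points below are arbitrary): the ratio between $f_1(x)f_2(x)$ and the pdf of $N(\mu, \Sigma)$ should be a constant independent of $x$.&lt;&#x2F;p&gt;

```python
import numpy as np

def mvn_pdf(x, mu, S):
    # pdf of the multi-variate Gaussian N(mu, S) evaluated at x.
    d = x - mu
    k = len(mu)
    norm = np.sqrt((2 * np.pi) ** k * np.linalg.det(S))
    return np.exp(-0.5 * d @ np.linalg.inv(S) @ d) / norm

S1 = np.array([[2.0, 0.3], [0.3, 1.0]])
S2 = np.array([[1.0, -0.2], [-0.2, 1.5]])
mu1 = np.array([1.0, 0.0])
mu2 = np.array([0.0, 2.0])

# Parameters given by Lemma II.
S = np.linalg.inv(np.linalg.inv(S1) + np.linalg.inv(S2))
mu = S @ np.linalg.inv(S1) @ mu1 + S @ np.linalg.inv(S2) @ mu2

# The ratio f1(x) * f2(x) / pdf_of_N(mu, S)(x) should not depend on x.
points = [np.array([0.0, 0.0]), np.array([1.0, 1.5]), np.array([-1.0, 2.0])]
ratios = [mvn_pdf(x, mu1, S1) * mvn_pdf(x, mu2, S2) / mvn_pdf(x, mu, S)
          for x in points]
```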
&lt;h2 id=&quot;appendix-lemma-iii&quot;&gt;Appendix - Lemma III&lt;&#x2F;h2&gt;
&lt;p&gt;Lemma III further simplifies Lemma II with a transformation. It states
that the solution of Lemma II can be rewritten as&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\Sigma &amp;amp;=&amp;amp; \Sigma_2 - K\Sigma_2 \\
\mu &amp;amp;=&amp;amp; \mu_2 + K(\mu_1 - \mu_2)
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;where&lt;&#x2F;p&gt;
&lt;p&gt;$$
K = \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}
$$&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;proof:&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;We first apply some transformation on $\Sigma$.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\Sigma^{-1} &amp;amp;=&amp;amp; \Sigma_1^{-1} + \Sigma_2^{-1} \\
&amp;amp;=&amp;amp; \Sigma_1^{-1}\Sigma_1(\Sigma_1^{-1} + \Sigma_2^{-1})\Sigma_2\Sigma_2^{-1} \\
&amp;amp;=&amp;amp; \Sigma_1^{-1} \left[ \Sigma_1(\Sigma_1^{-1} + \Sigma_2^{-1})\Sigma_2 \right] \Sigma_2^{-1} \\
&amp;amp;=&amp;amp; \Sigma_1^{-1} (\Sigma_1 + \Sigma_2) \Sigma_2^{-1} \\
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Taking the inverse of both sides, we have&lt;&#x2F;p&gt;
&lt;p&gt;$$
\Sigma = \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}\Sigma_1 = \Sigma_1(\Sigma_1 + \Sigma_2)^{-1}\Sigma_2
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that we implicitly used the property that covariance matrices
and their inverses are &lt;strong&gt;symmetric&lt;&#x2F;strong&gt;.&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;We then apply some transformation on $\mu$.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\mu &amp;amp;=&amp;amp; \Sigma \Sigma_1^{-1}\mu_1 + \Sigma \Sigma_2^{-1}\mu_2 \\
&amp;amp;=&amp;amp; \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}\Sigma_1 \Sigma_1^{-1}\mu_1 + \Sigma_1(\Sigma_1 + \Sigma_2)^{-1}\Sigma_2 \Sigma_2^{-1}\mu_2 \\
&amp;amp;=&amp;amp; \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}\mu_1 + \Sigma_1(\Sigma_1 + \Sigma_2)^{-1}\mu_2 \\
&amp;amp;=&amp;amp; \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}\mu_1 + (\Sigma_1 + \Sigma_2 - \Sigma_2)(\Sigma_1 + \Sigma_2)^{-1}\mu_2 \\
&amp;amp;=&amp;amp; \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}\mu_1 +  \mu_2 -\Sigma_2(\Sigma_1 + \Sigma_2)^{-1}\mu_2 \\
&amp;amp;=&amp;amp; \mu_2 + \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}(\mu_1 - \mu_2)
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;It is now clear that if we define $K = \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}$, then&lt;&#x2F;p&gt;
&lt;p&gt;$$
\mu = \mu_2 + K(\mu_1 - \mu_2)
$$&lt;&#x2F;p&gt;
&lt;p&gt;This proves the first half of Lemma III.&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;Let us take another look at $\Sigma$.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\Sigma &amp;amp;=&amp;amp; \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}\Sigma_1 \\
&amp;amp;=&amp;amp; K\Sigma_1 \\
&amp;amp;=&amp;amp; K(\Sigma_1 + \Sigma_2 - \Sigma_2) \\
&amp;amp;=&amp;amp; K(\Sigma_1 + \Sigma_2) - K\Sigma_2 \\
&amp;amp;=&amp;amp; \Sigma_2(\Sigma_1 + \Sigma_2)^{-1}(\Sigma_1 + \Sigma_2) - K\Sigma_2 \\
&amp;amp;=&amp;amp; \Sigma_2 - K\Sigma_2
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;The above proves the second half of Lemma III.&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;Therefore the proof is concluded. $\blacksquare$&lt;&#x2F;p&gt;
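&lt;p&gt;As a final sanity check, the Lemma II form and the Lemma III form of $(\mu, \Sigma)$ should agree numerically. A quick sketch with arbitrary (symmetric positive-definite) matrices:&lt;&#x2F;p&gt;

```python
import numpy as np

S1 = np.array([[2.0, 0.3], [0.3, 1.0]])
S2 = np.array([[1.0, -0.2], [-0.2, 1.5]])
mu1 = np.array([1.0, 0.0])
mu2 = np.array([0.0, 2.0])

# Lemma II form.
S_a = np.linalg.inv(np.linalg.inv(S1) + np.linalg.inv(S2))
mu_a = S_a @ np.linalg.inv(S1) @ mu1 + S_a @ np.linalg.inv(S2) @ mu2

# Lemma III form, with K = Sigma_2 (Sigma_1 + Sigma_2)^{-1}.
K = S2 @ np.linalg.inv(S1 + S2)
S_b = S2 - K @ S2
mu_b = mu2 + K @ (mu1 - mu2)
```

&lt;p&gt;The Lemma III form is the one we used in the main derivation, since it avoids inverting $\Sigma_1$ and $\Sigma_2$ individually.&lt;&#x2F;p&gt;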
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Minimalist&#x27;s Kalman Filter Derivation, Part I</title>
        <published>2020-08-03T00:00:00+00:00</published>
        <updated>2020-08-03T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://www.breakds.org/kalman-filter-part-1/"/>
        <id>https://www.breakds.org/kalman-filter-part-1/</id>
        
        <content type="html" xml:base="https://www.breakds.org/kalman-filter-part-1/">&lt;h2 id=&quot;motivation&quot;&gt;Motivation&lt;&#x2F;h2&gt;
&lt;p&gt;State estimation has many applications in general robotics, for
example autonomous driving localization and environment prediction.
The Kalman filter is a classical yet powerful algorithm that tackles such
problems beautifully. Although there are already many articles,
textbooks and papers on how to derive the algorithm, I found most of
them too heavy on the theoretical side and hard for a
first-time learner who comes from an engineering background to follow.
Therefore, I will shamelessly attempt to fill this hole with a series
of posts.&lt;&#x2F;p&gt;
&lt;p&gt;This is going to be the first post of the series, focusing only on
the one-dimensional case. Future posts will talk about the multi-variate
version of the Kalman filter.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Spoiler:&lt;&#x2F;strong&gt; there will be a lot of math equations. But rest assured,
nothing will exceed the level of basic calculus.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;single-step-prediction&quot;&gt;Single Step Prediction&lt;&#x2F;h2&gt;
&lt;p&gt;Let&#x27;s say we have a one-dimensional linear Markovian system, whose
transition function is known. This means&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;The state of the system can be represented as a single scalar.
Let&#x27;s denote it as $x$.&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;The transition function is a linear function, where the next state
only depends on the current state. Therefore we can write the
transition function as&lt;&#x2F;p&gt;
&lt;p&gt;$$
x_{t+1} = a  \cdot x_t
$$&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;Suppose you have an estimation of $x_t$ in the form of a Gaussian
distribution&lt;&#x2F;p&gt;
&lt;p&gt;$$
x_t \sim N(\hat{x}_t, \sigma_t^2)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Based on that, what is your best-effort guess about the next state
$x_{t+1}$? First of all, the reason that we &lt;strong&gt;can&lt;&#x2F;strong&gt; make such a
prediction of the next state is because the transition function
actually reveals the relationship between $x_t$ and $x_{t+1}$, which
happens to be &lt;strong&gt;linear&lt;&#x2F;strong&gt; in this case. I know that &lt;strong&gt;intuitively&lt;&#x2F;strong&gt;,
you would guess the answer immediately:&lt;&#x2F;p&gt;
&lt;p&gt;$$
x_{t+1} \sim N(a\hat{x}_t, a^2\sigma_t^2)
$$&lt;&#x2F;p&gt;
&lt;p&gt;And that is the correct answer. But how do you prove that? Or more
generally, if we have a random variable $X \sim N(\mu, \sigma^2)$, can
we prove that another random variable that satisfies $Y = aX$ actually
follows the distribution $Y \sim N(a\mu, a^2\sigma^2)$?&lt;&#x2F;p&gt;
&lt;p&gt;Let&lt;&#x2F;p&gt;
&lt;p&gt;$$
\phi(x) = \frac{1}{\sqrt{2\pi}} \cdot e^{-\frac{1}{2}x^2}
$$&lt;&#x2F;p&gt;
&lt;p&gt;be the &lt;strong&gt;pdf&lt;&#x2F;strong&gt; (probability density function) of a standard gaussian
distribution $N(0, 1)$. It is easy to derive that the &lt;strong&gt;pdf&lt;&#x2F;strong&gt; of a
general gaussian distribution $N(\mu, \sigma^2)$ that $X$ follows
would be&lt;&#x2F;p&gt;
&lt;p&gt;$$
f_X(x) = \frac{1}{\sqrt{2\pi}\sigma} \cdot e^{-\frac{1}{2\sigma^2}(x-\mu)^2} = \frac{1}{\sigma}\phi \left( \frac{x - \mu}{\sigma} \right)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Using the trick called &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Differential_form&quot;&gt;differential form&lt;&#x2F;a&gt;, the probability of $X$ taking a specific value $x$ is&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#1&quot;&gt;1&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;&lt;&#x2F;p&gt;
&lt;p&gt;$$
\mathbb{P} [X = x] = f_X(x) \mathrm{d}x = \frac{1}{\sigma}\phi \left( \frac{x - \mu}{\sigma} \right) \mathrm{d} x
$$&lt;&#x2F;p&gt;
&lt;p&gt;Okay, so what does the &lt;strong&gt;pdf&lt;&#x2F;strong&gt; of $Y$ (i.e. $\mathbb{P}(Y = y)$) look
like? It turns out that we can easily derive it with a bit of
transformation:&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\mathbb{P} \left[ Y = y \right] &amp;amp;=&amp;amp; \mathbb{P} \left[ X = \frac{y}{a} \right] \\\
&amp;amp;=&amp;amp; f_X \left(\frac{y}{a}\right) \mathrm{d}x \\\
&amp;amp;=&amp;amp; f_X \left(\frac{y}{a}\right) \frac{\mathrm{d}y}{a} \\\
&amp;amp;=&amp;amp; \frac{1}{\sigma}\phi \left( \frac{\frac{y}{a} - \mu}{\sigma} \right) \frac{\mathrm{d}y}{a} \\\
&amp;amp;=&amp;amp; \frac{1}{a\sigma}\phi \left( \frac{y - a\mu}{a\sigma} \right) \mathrm{d}y
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;This basically shows that $Y$&#x27;s &lt;strong&gt;pdf&lt;&#x2F;strong&gt; is nothing but the &lt;strong&gt;pdf&lt;&#x2F;strong&gt; of
$N(a\mu, a^2\sigma^2)$, which concludes the proof.&lt;&#x2F;p&gt;
&lt;div class=&quot;footnote-definition&quot; id=&quot;1&quot;&gt;&lt;sup class=&quot;footnote-definition-label&quot;&gt;1&lt;&#x2F;sup&gt;
&lt;p&gt;An intuitive perspective that helps understanding here: $f_X(x) \mathrm{d}x$ is actually an area, whose &lt;strong&gt;fundamental unit&lt;&#x2F;strong&gt; is probability!&lt;&#x2F;p&gt;
&lt;&#x2F;div&gt;
&lt;p&gt;The conclusion proved above enables us to solve the prediction problem
from the very beginning of this section,&lt;&#x2F;p&gt;
&lt;p&gt;$$
\textrm{based on } x_{t+1} = a  \cdot x_t \textrm{ and } x_t \sim N(\hat{x}_t, \sigma_t^2)
$$&lt;&#x2F;p&gt;
&lt;p&gt;$$
\textrm{we can predict } x_{t+1} \sim N(a\hat{x}_t, a^2\sigma_t^2)
$$&lt;&#x2F;p&gt;
&lt;p&gt;equivalently, this means we can obtain estimation of $x_{t+1}$ (without observing it) as&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\hat{x}_{t+1} &amp;amp;=&amp;amp; a\hat{x}_t \\
\sigma_{t+1}^2 &amp;amp;=&amp;amp; a^2 \sigma_t^2
\end{cases}
$$&lt;&#x2F;p&gt;
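&lt;p&gt;The scaling property we just proved can also be checked with a quick Monte Carlo simulation (the parameter values and sample size below are arbitrary choices of mine):&lt;&#x2F;p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
a, mu, sigma = 2.0, 1.0, 0.5
n = 200_000

# Sample X ~ N(mu, sigma^2) and scale it; Y = a * X should be
# distributed as N(a * mu, a^2 * sigma^2).
x = rng.normal(mu, sigma, size=n)
y = a * x
```

&lt;p&gt;The empirical mean and standard deviation of &lt;code&gt;y&lt;&#x2F;code&gt; land close to $a\mu$ and $a\sigma$, as the proof predicts.&lt;&#x2F;p&gt;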
&lt;h2 id=&quot;uncertainty-in-transition-function&quot;&gt;Uncertainty in Transition Function&lt;&#x2F;h2&gt;
&lt;p&gt;Now it is time to introduce one variation on top of the simplest case
we discussed in the previous section. In reality the transition is
usually not perfect, which means that there is an error associated
with it. Mathematically, it means&lt;&#x2F;p&gt;
&lt;p&gt;$$
x_{t+1} = a \cdot x_t + e_t
$$&lt;&#x2F;p&gt;
&lt;p&gt;As usual, for simplicity we assume the error is a zero-mean random
variable that follows a Gaussian distribution, i.e.&lt;&#x2F;p&gt;
&lt;p&gt;$$
e_t \sim N \left(0, \sigma_{e_t}^2 \right)
$$&lt;&#x2F;p&gt;
&lt;p&gt;How should we revise our prediction under such condition? Remember we
are still solving the following question - if we already have an
estimation of $x_t$ as&lt;&#x2F;p&gt;
&lt;p&gt;$$
x_t \sim N(\hat{x}_t, \sigma_t^2)
$$&lt;&#x2F;p&gt;
&lt;p&gt;what is a good estimation of $x_{t+1}$, given that we know (although
not precisely in this case) the transition function?&lt;&#x2F;p&gt;
&lt;p&gt;Here we are going to introduce another useful idea - the &lt;strong&gt;generative
model&lt;&#x2F;strong&gt;. The &lt;strong&gt;generative model&lt;&#x2F;strong&gt; basically describes the procedure to
get a sample value of a random variable. In this particular case, the
generative model of $x_{t+1}$ consists of:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Sample $a \cdot x_t$ out of the distribution $N(a\hat{x}_t, a^2\sigma_t^2)$ (Note that this is the conclusion from the previous section)&lt;&#x2F;li&gt;
&lt;li&gt;Sample $e_t$ out of the distribution $N(0, \sigma_{e_t}^2)$&lt;&#x2F;li&gt;
&lt;li&gt;Construct $x_{t+1}$ by adding the two sampled values up&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;So you can see that the &lt;strong&gt;generative model&lt;&#x2F;strong&gt; is basically an
interpretation of the problem formulation, and provides no new knowledge
at all. However, with such an interpretation it is clear to see that
$x_{t+1}$ as a random variable is basically the sum of two
&lt;strong&gt;independently distributed&lt;&#x2F;strong&gt; Gaussian random variables!&lt;&#x2F;p&gt;
&lt;p&gt;I will cheat here by referring to the &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Sum_of_normally_distributed_random_variables#:~:text=Independent%20random%20variables,-Let%20X%20and&amp;amp;text=This%20means%20that%20the%20sum,squares%20of%20the%20standard%20deviations&quot;&gt;generating function based
proof&lt;&#x2F;a&gt;
from wikipedia. As you have probably guessed, the conclusion is that&lt;&#x2F;p&gt;
&lt;p&gt;$$
\textrm{if independent } \begin{cases}
X &amp;amp;\sim N(\mu_X, \sigma_X^2) \\
Y &amp;amp;\sim N(\mu_Y, \sigma_Y^2)
\end{cases} \textrm{ then } Z = X + Y \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)
$$&lt;&#x2F;p&gt;
&lt;p&gt;By plugging in our generative model, we obtain&lt;&#x2F;p&gt;
&lt;p&gt;$$
x_{t+1} \sim N(a\hat{x}_t, a^2\sigma_t^2 + \sigma_{e_t}^2)
$$&lt;&#x2F;p&gt;
&lt;p&gt;which means such &lt;strong&gt;good prediction&lt;&#x2F;strong&gt; would be&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\hat{x}_{t+1} &amp;amp;=&amp;amp; a\hat{x}_t \\
\sigma_{t+1} &amp;amp;=&amp;amp; a^2 \sigma_t^2 + \sigma_{e_t}^2
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Yep, just add the variance of the error to the estimated variance. Pretty simple, right?&lt;&#x2F;p&gt;
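&lt;p&gt;A Monte Carlo check of this conclusion, following the generative model step by step (parameter values and sample size are again my own arbitrary choices):&lt;&#x2F;p&gt;

```python
import numpy as np

rng = np.random.default_rng(1)
a, x_hat, sigma, sigma_e = 0.9, 2.0, 0.4, 0.3
n = 200_000

# Follow the generative model: sample x_t, apply the transition, add noise.
x_t = rng.normal(x_hat, sigma, size=n)
e_t = rng.normal(0.0, sigma_e, size=n)
x_next = a * x_t + e_t

# Prediction from the formulas above: N(a * x_hat, a^2 sigma^2 + sigma_e^2).
pred_mean = a * x_hat
pred_var = a ** 2 * sigma ** 2 + sigma_e ** 2
```

&lt;p&gt;The empirical mean and variance of &lt;code&gt;x_next&lt;&#x2F;code&gt; match &lt;code&gt;pred_mean&lt;&#x2F;code&gt; and &lt;code&gt;pred_var&lt;&#x2F;code&gt; closely.&lt;&#x2F;p&gt;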
&lt;h2 id=&quot;let-there-be-observations&quot;&gt;Let There Be Observations&lt;&#x2F;h2&gt;
&lt;p&gt;So we know how to predict $x_{t+1}$ given $x_t$, which is great. This
means that if we happen to know the initial state of the system,
$x_0$, we can start to predict $x_1$, and then $x_2$, ..., till any
$x_t$, which will be&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\hat{x}_t &amp;amp;=&amp;amp; a^t\hat{x}_0 \\
\sigma_t^2 &amp;amp;=&amp;amp; a^2 \cdot (a^2 \cdot ( a^2 \cdots) + \sigma_{e_{t-2}}^2) + \sigma_{e_{t-1}}^2
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;As we can see, there is one fatal problem in the above prediction. As
$t$ grows, our estimation becomes less precise, because the variance
grows very quickly: each time we make the
prediction for one more step, the variance of the error is added to
it. To see it more clearly, when $a=1$, we will have&lt;&#x2F;p&gt;
&lt;p&gt;$$
\sigma_t^2 = \sigma_{e_0}^2 + \sigma_{e_1}^2 + \cdots + \sigma_{e_{t-1}}^2
$$&lt;&#x2F;p&gt;
&lt;p&gt;It is easy to understand this &lt;strong&gt;error accumulation&lt;&#x2F;strong&gt; intuitively. As
we make more predictions, we are not getting new information about the
system. Think about it - if you only know what a cat looks like when
it is 1 week old, how can you precisely predict what it looks like when
it is 3 years old? If you only know your weight before COVID-19
kept us at home, how do you precisely estimate your current weight?
The key here is that you need &lt;strong&gt;constant&lt;&#x2F;strong&gt; feedback to guide your
estimation as it drifts.&lt;&#x2F;p&gt;
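&lt;p&gt;The accumulation is just the recurrence $\sigma_{t+1}^2 = a^2\sigma_t^2 + \sigma_{e_t}^2$ applied repeatedly. A tiny sketch with $a = 1$ and made-up error variances:&lt;&#x2F;p&gt;

```python
# Recurrence sigma_{t+1}^2 = a^2 * sigma_t^2 + sigma_{e_t}^2 with a = 1:
# every pure-prediction step adds the transition error variance.
a = 1.0
var = 0.0                      # variance of the initial (exactly known) state
error_vars = [0.1, 0.2, 0.15]  # made-up sigma_{e_t}^2 values
for ve in error_vars:
    var = a ** 2 * var + ve
# var is now the sum 0.1 + 0.2 + 0.15 of all error variances
```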
&lt;p&gt;Okay, let&#x27;s take one more step in the problem formulation, so that it
will be &lt;strong&gt;more realistic&lt;&#x2F;strong&gt;. Now we are allowed to take measurements of
the system via an &lt;strong&gt;observation function&lt;&#x2F;strong&gt;. In the weight example,
this translates to being allowed to weigh yourself with an electronic
scale every now and then. It seems that with the means to take
measurements, we do not need to estimate the state anymore: we can
simply observe the readings and get the precise value! Except that
in reality the measurement is usually not accurate. Therefore,
the &lt;strong&gt;observation function&lt;&#x2F;strong&gt; has an associated &lt;strong&gt;error&lt;&#x2F;strong&gt; as well.
Assuming a &lt;strong&gt;linear&lt;&#x2F;strong&gt; observation function, we can write the readings
mathematically as:&lt;&#x2F;p&gt;
&lt;p&gt;$$
y_t = h_t \cdot x_t + r_t, \textrm{ where } r_t \sim N \left(0, \sigma_{r_t}^2 \right)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that when you take a measurement at time $t + 1$, you can directly
observe $y_{t+1}$ (though $y_{t+1}$ is not $x_{t+1}$, and the latter
is what we want to estimate). The question now becomes: how do we make a
good estimation of $x_{t+1}$, given&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;A good estimation of the previous state $x_t$, and&lt;&#x2F;li&gt;
&lt;li&gt;The current measurement reading $y_{t+1}$&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Let&#x27;s first find the &lt;strong&gt;generative model&lt;&#x2F;strong&gt; interpretation of this. We
can see that $y_{t+1}$ is generated in the following 3 steps:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Sample $x_{t+1} = a \cdot x_t + e_t$ out of the distribution $N(a\hat{x}_t, a^2\sigma_t^2 + \sigma_{e_t}^2)$ (Note that this is the conclusion from the previous section)&lt;&#x2F;li&gt;
&lt;li&gt;Sample $r_{t+1}$ out of the distribution $N(0, \sigma_{r_{t+1}}^2)$&lt;&#x2F;li&gt;
&lt;li&gt;Directly compute $y_{t+1} = h_{t+1} \cdot x_{t+1} + r_{t+1}$&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;Let&#x27;s stop for a while to take a closer look at the above generative
model. The distribution $N(a\hat{x}_t, a^2\sigma_t^2 + \sigma_{e_t}^2)$ comes from the conclusion of the previous section,
which represents the best&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#2&quot;&gt;2&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt; estimation of $x_{t+1}$ we can get based on
&lt;strong&gt;pure prediction&lt;&#x2F;strong&gt;. The mean and variance of this distribution will
be used quite a lot in the derivation below, so it is good to give them
names. Let&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
x&#x27;_{t+1} &amp;amp;=&amp;amp; a\hat{x}_t \\
\sigma&#x27;^2_{t+1} &amp;amp;=&amp;amp; a^2 \sigma_t^2 + \sigma_{e_t}^2
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that both $x&#x27;_{t+1}$ and $\sigma&#x27;^2_{t+1}$ are deterministic
values, i.e. neither of them is a random variable.&lt;&#x2F;p&gt;
&lt;div class=&quot;footnote-definition&quot; id=&quot;2&quot;&gt;&lt;sup class=&quot;footnote-definition-label&quot;&gt;2&lt;&#x2F;sup&gt;
&lt;p&gt;I am being informal here as we haven&#x27;t formally defined what
&lt;strong&gt;the best&lt;&#x2F;strong&gt; estimation means. We will likely get to this topic in
the future, so bear with me for now.&lt;&#x2F;p&gt;
&lt;&#x2F;div&gt;
&lt;p&gt;Although the above generative model is about generating $y_{t+1}$, it
is $x_{t+1}$ that we actually want to estimate. We can do this by
deriving the &lt;strong&gt;pdf&lt;&#x2F;strong&gt; of $x_{t+1}$. The following derivation may
seem very tedious, but I will try to be clear on each step, and
trust me, this will be the last challenge in this post!&lt;&#x2F;p&gt;
&lt;p&gt;Given that we have observed $y_{t+1} = y$, what is the probability of
$x_{t+1} = x$? Such probability can be written as&lt;&#x2F;p&gt;
&lt;p&gt;$$
\forall x, \mathbb{P} [ x_{t+1} = x \mid y_{t+1} = y] = f_{x_{t+1}}(x) \mathrm{d}x
$$&lt;&#x2F;p&gt;
&lt;p&gt;where $f_{x_{t+1}}(x)$ is the unknown (yet) &lt;strong&gt;pdf&lt;&#x2F;strong&gt; of $x_{t+1}$ that
we want to derive. Also note that there is $\forall x$ in the
statement, which is &lt;strong&gt;very important&lt;&#x2F;strong&gt;. It means that the equation
holds for every single $x$.&lt;&#x2F;p&gt;
&lt;p&gt;By applying Bayes&#x27;s law, the left hand side can also be transformed as&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\forall x,  \mathbb{P} [ x_{t+1} = x \mid y_{t+1} = y] &amp;amp;=&amp;amp; \frac{\mathbb{P}[y_{t+1}=y \mid x_{t+1} = x] \mathbb{P}[x_{t+1} = x]}{\mathbb{P}[y_{t+1} = y]} \\
&amp;amp;=&amp;amp; \frac{\mathbb{P}[r_{t+1} = y - h_{t+1}x] \mathbb{P}[x_{t+1} = x]}{\mathbb{P}[y_{t+1} = y]}
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;So there are 3 items on the right hand side. Let&#x27;s crack them one by
one.&lt;&#x2F;p&gt;
&lt;p&gt;The simplest one here is $\mathbb{P}[y_{t+1} = y]$. Since it does not
depend on $x$, this can be just written as&lt;&#x2F;p&gt;
&lt;p&gt;$$
\mathbb{P}[y_{t+1} = y] = \mathrm{Const} \cdot dy
$$&lt;&#x2F;p&gt;
&lt;p&gt;Next comes $\mathbb{P}[x_{t+1} = x]$, without conditioning on the
value of $y_{t+1}$. This is the &lt;strong&gt;pure prediction&lt;&#x2F;strong&gt; we discussed
above, which $ \sim N(x&#x27;_{t+1}, \sigma&#x27;^2_{t+1})$. Therefore it is
simply&lt;&#x2F;p&gt;
&lt;p&gt;$$
\mathbb{P}[x_{t+1} = x] = \mathrm{Const} \cdot \exp\left(-\frac{(x-x&#x27;_{t+1})^2}{2\sigma&#x27;^2_{t+1}}\right) \mathrm{d} x
$$&lt;&#x2F;p&gt;
&lt;p&gt;The last one, $\mathbb{P}[r_{t+1} = y - h_{t+1}x]$, is about $r_{t+1}$,
which happens to follow a Gaussian distribution as well (even
better, its mean is zero)! This means&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\mathbb{P}[r_{t+1} = y - h_{t+1}x] &amp;amp;=&amp;amp; \mathrm{Const} \cdot \exp \left( -\frac{(y - h_{t+1}x)^2}{2\sigma^2_{r_{t+1}}}  \right) \mathrm{d}r \\
&amp;amp;=&amp;amp; \mathrm{Const} \cdot \exp \left( -\frac{(y - h_{t+1}x)^2}{2\sigma^2_{r_{t+1}}}  \right) (\mathrm{d}y - h_{t+1}\mathrm{d}x)
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that $\mathrm{d}r$ can be written in the form above because of
differential form arithmetic. It is good to understand the rules
behind it, but once you get familiar with them, they are
no stranger than the rules you use to take derivatives.&lt;&#x2F;p&gt;
&lt;p&gt;Therefore, taking the above 3 expanded components and plugging them back,
keeping in mind the differential form rule $\mathrm{d}x \wedge
\mathrm{d}x = 0$, we have&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\forall x,  &amp;amp;&amp;amp; f_{x_{t+1}}(x) \mathrm{d}x \\
&amp;amp;=&amp;amp; \mathbb{P} [ x_{t+1} = x \mid y_{t+1} = y] \\
&amp;amp;=&amp;amp; \frac{\mathrm{Const} \cdot
\exp \left(
-\frac{(x-x&#x27;_{t+1})^2}{2\sigma&#x27;^2_{t+1}} -\frac{(y - h_{t+1}x)^2}{2\sigma^2_{r_{t+1}}}
\right) (\mathrm{d}x \wedge \mathrm{d} y - h_{t+1} \mathrm{d}x \wedge \mathrm{d}x)}
{\mathrm{Const} \cdot \mathrm{d}y} \\
&amp;amp;=&amp;amp; \mathrm{Const} \cdot \exp \left(
-\frac{(x-x&#x27;_{t+1})^2}{2\sigma&#x27;^2_{t+1}} -\frac{(y - h_{t+1}x)^2}{2\sigma^2_{r_{t+1}}}
\right) \mathrm{d} x
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Let&#x27;s then take a closer look at the terms inside $\exp()$&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
&amp;amp;&amp;amp;-\frac{(x-x&#x27;_{t+1})^2}{2\sigma&#x27;^2_{t+1}} -\frac{(y - h_{t+1}x)^2}{2\sigma^2_{r_{t+1}}} \\
&amp;amp;=&amp;amp;
-\frac{(\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1})x^2 -
2(\sigma^2_{r_{t+1}} x&#x27;_{t+1} + \sigma&#x27;^2_{t+1}h_{t+1}y)x + \mathrm{Const}}
{2\sigma&#x27;^2_{t+1}\sigma^2_{r_{t+1}}} \\
&amp;amp;=&amp;amp; -\frac{1}{2} \frac{x^2 -
2\frac{\sigma^2_{r_{t+1}} x&#x27;_{t+1} + \sigma&#x27;^2_{t+1}h_{t+1}y}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}}x}
{\frac{\sigma&#x27;^2_{t+1}\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}}} + \mathrm{Const}  \\
&amp;amp;=&amp;amp; - \frac{1}{2}\frac{\left(x - \frac{\sigma^2_{r_{t+1}} x&#x27;_{t+1} + \sigma&#x27;^2_{t+1}h_{t+1}y}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} \right)^2}
{\frac{\sigma&#x27;^2_{t+1}\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}}} + \mathrm{Const}
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Plugging this back into the equation above, we have&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\forall x,  f_{x_{t+1}}(x) \mathrm{d}x &amp;amp;=&amp;amp; \mathbb{P} [ x_{t+1} = x \mid y_{t+1} = y] \\
&amp;amp;=&amp;amp; \mathrm{Const} \cdot \exp \left(  - \frac{1}{2}\frac{\left(x - \frac{\sigma^2_{r_{t+1}} x&#x27;_{t+1} + \sigma&#x27;^2_{t+1}h_{t+1}y}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} \right)^2}
{\frac{\sigma&#x27;^2_{t+1}\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}}} \right) \mathrm{d} x
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Removing $\mathrm{d}x$ from both sides, we have&lt;&#x2F;p&gt;
&lt;p&gt;$$
\forall x,  f_{x_{t+1}}(x) =
\mathrm{Const} \cdot \exp \left(  - \frac{1}{2} \frac{\left(x - \frac{\sigma^2_{r_{t+1}} x&#x27;_{t+1} + \sigma&#x27;^2_{t+1}h_{t+1}y_{t+1}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} \right)^2}
{\frac{\sigma&#x27;^2_{t+1}\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}}} \right)
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that since $y$ is just the observed value of $y_{t+1}$, it has
been replaced with $y_{t+1}$.&lt;&#x2F;p&gt;
&lt;p&gt;This means that $x_{t+1}$ follows a Gaussian distribution! We can even
read off the mean and the variance of the estimate directly from the
formula above, i.e.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\hat{x}_{t+1} &amp;amp;=&amp;amp; \frac{\sigma^2_{r_{t+1}} x&#x27;_{t+1} + \sigma&#x27;^2_{t+1}h_{t+1}y_{t+1}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} \\
\sigma^2_{t+1} &amp;amp;=&amp;amp; \frac{\sigma&#x27;^2_{t+1}\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}}
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;h2 id=&quot;making-sense-of-the-result&quot;&gt;Making Sense of the Result&lt;&#x2F;h2&gt;
&lt;p&gt;The answer above still looks very complicated, so let me try to
interpret it in a more intuitive way in this section.&lt;&#x2F;p&gt;
&lt;p&gt;Let&#x27;s start with the mean&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{darray}{rcl}
\hat{x}_{t+1} &amp;amp;=&amp;amp; \frac{\sigma^2_{r_{t+1}} x&#x27;_{t+1} + \sigma&#x27;^2_{t+1}h_{t+1}y_{t+1}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} \\
&amp;amp;=&amp;amp; \frac{\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} \cdot x&#x27;_{t+1} +
\frac{\sigma&#x27;^2_{t+1}h_{t+1}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} \cdot y_{t+1} \\
&amp;amp;=&amp;amp; \frac{\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} \cdot x&#x27;_{t+1} +
\frac{\sigma&#x27;^2_{t+1}h^2_{t+1}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} \cdot
\frac{y_{t+1}}{h_{t+1}} \\
&amp;amp;=&amp;amp; K \cdot x&#x27;_{t+1} + (1 - K) \cdot \frac{y_{t+1}}{h_{t+1}}
\end{darray}
$$&lt;&#x2F;p&gt;
&lt;p&gt;Note that in the above formula, we let&lt;&#x2F;p&gt;
&lt;p&gt;$$
K = \frac{\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}}
$$&lt;&#x2F;p&gt;
&lt;p&gt;$K$ is clearly a number between $0$ and $1$. This means that the
estimated mean of $x_{t+1}$ is actually a &lt;b&gt;weighted combination&lt;&#x2F;b&gt;
of $x&#x27;_{t+1}$ and $y_{t+1} &#x2F; h_{t+1}$. It is worth noting that&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;$x&#x27;_{t+1}$ is the best guess you can have based on &lt;b&gt;pure prediction&lt;&#x2F;b&gt;&lt;&#x2F;li&gt;
&lt;li&gt;$y_{t+1} &#x2F; h_{t+1}$ is the best guess you can have based on &lt;b&gt;pure observation&lt;&#x2F;b&gt;&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;So this is basically about trusting both pieces of evidence, each with
a grain of salt. How much you trust each of them depends on the
variance of the corresponding guess: the bigger the variance, the less
trustworthy. Very reasonable, right?&lt;&#x2F;p&gt;
&lt;p&gt;What about the estimated variance of $x_{t+1}$? With $K$ as defined
above, it can be written as&lt;&#x2F;p&gt;
&lt;p&gt;$$
\sigma^2_{t+1} = K \cdot \sigma&#x27;^2_{t+1}
$$&lt;&#x2F;p&gt;
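&lt;p&gt;This is just a one-line rewrite of the variance we derived: multiplying the definition of $K$ by $\sigma&#x27;^2_{t+1}$ gives&lt;&#x2F;p&gt;
&lt;p&gt;$$
K \cdot \sigma&#x27;^2_{t+1} = \frac{\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} \cdot \sigma&#x27;^2_{t+1}
= \frac{\sigma&#x27;^2_{t+1}\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}} = \sigma^2_{t+1}
$$&lt;&#x2F;p&gt;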
&lt;p&gt;Intuitively, this just updates the pure-prediction-based variance
estimate now that we have an observation. Note that because $K &amp;lt; 1$, the
final estimated variance is going to be smaller than the pure-prediction-based
estimated variance!&lt;&#x2F;p&gt;
&lt;p&gt;At this point we can summarize the procedure of the Kalman Filter
update, i.e. how to obtain the $(t+1)$-step estimate from the $t$-step
estimate and a new observation.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;b&gt;Step I:&lt;&#x2F;b&gt; Compute the pure prediction based estimation.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
x&#x27;_{t+1} &amp;amp;=&amp;amp; a\hat{x}_t \\
\sigma&#x27;^2_{t+1} &amp;amp;=&amp;amp; a^2 \sigma_t^2 + \sigma_{e_t}^2
\end{cases}
$$&lt;&#x2F;p&gt;
&lt;p&gt;&lt;b&gt;Step II:&lt;&#x2F;b&gt; Compute the combination weight $K$, which is often called the &lt;b&gt;Kalman Gain&lt;&#x2F;b&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;$$
K = \frac{\sigma^2_{r_{t+1}}}{\sigma^2_{r_{t+1}} + h^2_{t+1}\sigma&#x27;^2_{t+1}}
$$&lt;&#x2F;p&gt;
&lt;p&gt;&lt;b&gt;Step III:&lt;&#x2F;b&gt; Use Kalman Gain $K$ and observation $y_{t+1}$ to
update the pure prediction based estimation.&lt;&#x2F;p&gt;
&lt;p&gt;$$
\begin{cases}
\hat{x}_{t+1} &amp;amp;=&amp;amp; K \cdot x&#x27;_{t+1} + (1 - K) \cdot \frac{y_{t+1}}{h_{t+1}} \\
\sigma^2_{t+1} &amp;amp;=&amp;amp; K \cdot \sigma&#x27;^2_{t+1}
\end{cases}
$$&lt;&#x2F;p&gt;
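&lt;p&gt;The three steps above can be sketched in a few lines of Python. This is a minimal illustration with hypothetical variable names (&lt;code&gt;a&lt;&#x2F;code&gt;, &lt;code&gt;h&lt;&#x2F;code&gt;, &lt;code&gt;var_e&lt;&#x2F;code&gt;, &lt;code&gt;var_r&lt;&#x2F;code&gt; stand for $a$, $h_{t+1}$, $\sigma^2_{e_t}$, and $\sigma^2_{r_{t+1}}$), not code from any library:&lt;&#x2F;p&gt;

```python
# Minimal sketch of the 1D Kalman Filter update derived above.
# All quantities are scalars; names mirror the notation of this post.

def kalman_update(x_hat, var, y_next, a, h, var_e, var_r):
    # Step I: pure-prediction-based estimation.
    x_pred = a * x_hat
    var_pred = a * a * var + var_e
    # Step II: the Kalman Gain K, i.e. the weight on the pure prediction.
    K = var_r / (var_r + h * h * var_pred)
    # Step III: blend the prediction with the observation-based guess y/h.
    x_hat_next = K * x_pred + (1 - K) * (y_next / h)
    var_next = K * var_pred
    return x_hat_next, var_next
```

&lt;p&gt;For example, with $a = h = 1$, no process noise, unit observation noise, and a prior estimate of $0$ with unit variance, a single observation $y_{t+1} = 1$ yields the estimate $0.5$ with variance $0.5$: the prediction and the observation are trusted equally.&lt;&#x2F;p&gt;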
&lt;h2 id=&quot;summarry&quot;&gt;Summary&lt;&#x2F;h2&gt;
&lt;p&gt;This post demonstrated the derivation of the 1D Kalman Filter, and
also touched on its intuitive interpretation. I think many of the
techniques used here, such as generative models and differential
forms, can find applications in many other situations.&lt;&#x2F;p&gt;
&lt;p&gt;However, in reality, the 1D Kalman Filter alone is rarely sufficient. This
post should have prepared you for the next journey: the multivariate
Kalman Filter. Stay tuned!&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Declarative Docker Container Service in NixOS</title>
        <published>2020-05-24T00:00:00+00:00</published>
        <updated>2020-05-24T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://www.breakds.org/declarative-docker-in-nixos/"/>
        <id>https://www.breakds.org/declarative-docker-in-nixos/</id>
        
        <content type="html" xml:base="https://www.breakds.org/declarative-docker-in-nixos/">&lt;h2 id=&quot;important-update-2020-05-24&quot;&gt;Important Update 2020.05.24&lt;&#x2F;h2&gt;
&lt;p&gt;After upgrading to NixOS 20.03, docker containers started to be
addressed by the container&#x27;s actual name instead of its systemd
service&#x27;s name. This means that to specify the database
container from the filerun web server&#x27;s container, you need to change
the value of &lt;code&gt;FR_DB_HOST&lt;&#x2F;code&gt; from &lt;code&gt;docker-filerun-mariadb.service&lt;&#x2F;code&gt; to
&lt;code&gt;filerun-mariadb&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;the-problem&quot;&gt;The Problem&lt;&#x2F;h2&gt;
&lt;p&gt;One of the biggest conveniences of NixOS is that many of the
services you may want to run are already packaged as a &quot;service&quot;. This means
that you can easily spin up a service like openssh with&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix z-code&quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;services&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator z-nix&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;openssh&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator z-nix&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;enable&lt;&#x2F;span&gt; &lt;span class=&quot;z-invalid z-illegal&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-constant z-language z-nix&quot;&gt;true&lt;&#x2F;span&gt;&lt;span class=&quot;z-invalid z-illegal&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;In fact, you can find a whole lot of such predefined services with
&lt;code&gt;services.&lt;&#x2F;code&gt; prefix in the &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;nixos.org&#x2F;nixos&#x2F;options.html#services.&quot;&gt;NixOS
Options&lt;&#x2F;a&gt; site.&lt;&#x2F;p&gt;
&lt;p&gt;I also run &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.filerun.com&#x2F;&quot;&gt;FileRun&lt;&#x2F;a&gt; as my NAS server
(similar to &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;nextcloud.com&#x2F;&quot;&gt;NextCloud&lt;&#x2F;a&gt; but I found FileRun to
be more user friendly and hassle-free)&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#1&quot;&gt;1&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;. The official &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;docs.filerun.com&#x2F;docker&quot;&gt;setup
guide&lt;&#x2F;a&gt; illustrated how to use &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;docs.docker.com&#x2F;compose&#x2F;&quot;&gt;Docker
Compose&lt;&#x2F;a&gt; to run the service. I found
it acceptable to run the services with docker containers, but having to use
&lt;code&gt;docker-compose&lt;&#x2F;code&gt; to manage the containers makes it &lt;strong&gt;less consistent&lt;&#x2F;strong&gt;
and &lt;strong&gt;less automatic&lt;&#x2F;strong&gt; compared with my other services.&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Since the service is not managed in the NixOS configuration, I have
to manually bring it up and down with &lt;code&gt;docker-compose&lt;&#x2F;code&gt;.&lt;&#x2F;li&gt;
&lt;li&gt;All the other services are managed automatically, and the
declarative configuration makes them easier to manage. I want my
FileRun instance to enjoy that as well.&lt;&#x2F;li&gt;
&lt;li&gt;In the future I might want to have more container-based services.
Experimenting with nix-native docker container-based services can
be helpful for that purpose.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;Therefore, I decided to write a nix service to replace the
&lt;code&gt;docker-compose&lt;&#x2F;code&gt; based solution, which is documented in this
post.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;the-original-docker-compose&quot;&gt;The Original Docker-Compose&lt;&#x2F;h2&gt;
&lt;p&gt;The docker-compose file (slightly adapted from the online doc provided by
FileRun) looks like this:&lt;&#x2F;p&gt;
&lt;pre class=&quot;z-code&quot;&gt;&lt;code&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;version: &amp;#39;2&amp;#39;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;services:
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;  db:
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;    image: mariadb:10.1
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;    environment:
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      MYSQL_ROOT_PASSWORD: filerunpasswd
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      MYSQL_USER: filerun
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      MYSQL_PASSWORD: filerunpasswd
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      MYSQL_DATABASE: filerundb
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;    volumes:
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      - &#x2F;home&#x2F;delegator&#x2F;filerun&#x2F;db:&#x2F;var&#x2F;lib&#x2F;mysql
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;  web:
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;    image: afian&#x2F;filerun
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;    environment:
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      FR_DB_HOST: db
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      FR_DB_PORT: 3306
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      FR_DB_NAME: filerundb
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      FR_DB_USER: filerun
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      FR_DB_PASS: filerunpasswd
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      APACHE_RUN_USER: delegator
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      APACHE_RUN_USER_ID: 600
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      APACHE_RUN_GROUP: delegator
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      APACHE_RUN_GROUP_ID: 600
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;    depends_on:
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      - db
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;    links:
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      - db:db
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;    ports:
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      - &amp;quot;6000:80&amp;quot;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;    volumes:
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      - &#x2F;home&#x2F;delegator&#x2F;filerun&#x2F;web:&#x2F;var&#x2F;www&#x2F;html
&lt;&#x2F;span&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;      - &#x2F;home&#x2F;delegator&#x2F;filerun&#x2F;user-files:&#x2F;user-files
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;It basically defines 2 docker containers, one for the database and one
for the FileRun web server itself, which is based on PHP and Apache. I
know little about either technology (part of the reason why I leave
them managed by docker containers with official images).&lt;&#x2F;p&gt;
&lt;p&gt;One thing worth emphasizing is that in order to set up the
communication between those two containers, a &lt;strong&gt;link&lt;&#x2F;strong&gt; is configured
for the web server container.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;the-database-container&quot;&gt;The Database Container&lt;&#x2F;h2&gt;
&lt;p&gt;With the new &lt;code&gt;docker-containers&lt;&#x2F;code&gt; option in NixOS configuration, bringing
up the MariaDB docker container is as simple as&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix z-code&quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;docker-containers&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator z-nix&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;filerun-mariadb&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-invalid z-illegal&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-attrset-or-function z-nix&quot;&gt;{&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;image&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;mariadb:10.1&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;environment&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-attrset-or-function z-nix&quot;&gt;{&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;MYSQL_ROOT_PASSWORD&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;randompasswd&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;MYSQL_USER&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;filerun&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;MYSQL_PASSWORD&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;randompasswd&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;MYSQL_DATABASE&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;filerundb&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-punctuation z-definition z-attrset z-nix&quot;&gt;}&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;volumes&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;[&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&#x2F;home&#x2F;delegator&#x2F;filerun&#x2F;db:&#x2F;var&#x2F;lib&#x2F;mysql&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;]&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-attrset z-nix&quot;&gt;}&lt;&#x2F;span&gt;&lt;span class=&quot;z-invalid z-illegal&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;This is basically a direct translation of the first half of the
previous docker-compose file. Nothing interesting yet.&lt;&#x2F;p&gt;
&lt;p&gt;To verify that it actually works, let&#x27;s run &lt;code&gt;docker ps&lt;&#x2F;code&gt;, and it will
show the container with name &lt;code&gt;docker-filerun-mariadb.service&lt;&#x2F;code&gt; (note
the naming convention). We can get into the docker container with&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;bash&quot; class=&quot;language-bash z-code&quot;&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;z-source z-shell z-bash&quot;&gt;&lt;span class=&quot;z-meta z-function-call z-shell&quot;&gt;&lt;span class=&quot;z-variable z-function z-shell&quot;&gt;$&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-shell&quot;&gt; docker exec&lt;span class=&quot;z-variable z-parameter z-option z-shell&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-parameter z-shell&quot;&gt; -&lt;&#x2F;span&gt;it&lt;&#x2F;span&gt; docker-filerun-mariadb.service &#x2F;bin&#x2F;bash&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;And once you are in the docker, the command&lt;&#x2F;p&gt;
&lt;pre class=&quot;z-code&quot;&gt;&lt;code&gt;&lt;span class=&quot;z-text z-plain&quot;&gt;mysql -u filerun -prandompasswd filerundb
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;should get you connected to the database.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;setting-up-the-bridge-networks&quot;&gt;Setting up the Bridge Networks&lt;&#x2F;h2&gt;
&lt;p&gt;By reading the documentation on &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;docs.docker.com&#x2F;network&#x2F;bridge&#x2F;&quot;&gt;docker
network&lt;&#x2F;a&gt;, it became clear to
me that I need to create a user-defined bridge network and put the two
docker containers in it, so that they can communicate with each other.
This replicates the &quot;link&quot; behavior in the docker-compose setup.&lt;&#x2F;p&gt;
&lt;p&gt;A bridge network can be created with the command &lt;code&gt;docker network create&lt;&#x2F;code&gt;. In order to ensure that such a bridge network is up, I am using
a trick that I learned from &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;kj.orbekk.com&#x2F;&quot;&gt;KJ&lt;&#x2F;a&gt; - write a
oneshot systemd service to do that.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix z-code&quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;systemd&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator z-nix&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;services&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator z-nix&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;init-filerun-network-and-files&lt;&#x2F;span&gt; &lt;span class=&quot;z-invalid z-illegal&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-attrset-or-function z-nix&quot;&gt;{&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;description&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;Create the network bridge filerun-br for filerun.&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;after&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;[&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;network.target&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;]&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;wantedBy&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;[&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;multi-user.target&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;]&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;serviceConfig&lt;&#x2F;span&gt;.&lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;Type&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;oneshot&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;   &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;script&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-other z-nix&quot;&gt;let&lt;&#x2F;span&gt; &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;dockercli&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;span class=&quot;z-markup z-italic&quot;&gt;&lt;span class=&quot;z-punctuation z-section z-embedded z-begin z-nix&quot;&gt;${&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;config&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator z-nix&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;virtualisation&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator z-nix&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;docker&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator z-nix&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;package&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-section z-embedded z-end z-nix&quot;&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&#x2F;bin&#x2F;docker&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;           &lt;span class=&quot;z-keyword z-other z-nix&quot;&gt;in&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-other z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-other z-start z-nix&quot;&gt;&amp;#39;&amp;#39;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-string z-quoted z-other z-nix&quot;&gt;             # Put a true at the end to prevent getting non-zero return code, which will
&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-string z-quoted z-other z-nix&quot;&gt;             # crash the whole service.
&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-string z-quoted z-other z-nix&quot;&gt;             check=$(&lt;span class=&quot;z-markup z-italic&quot;&gt;&lt;span class=&quot;z-punctuation z-section z-embedded z-begin z-nix&quot;&gt;${&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;dockercli&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-section z-embedded z-end z-nix&quot;&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; network ls | grep &amp;quot;filerun-br&amp;quot; || true)
&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-string z-quoted z-other z-nix&quot;&gt;             if [ -z &amp;quot;$check&amp;quot; ]; then
&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-string z-quoted z-other z-nix&quot;&gt;               &lt;span class=&quot;z-markup z-italic&quot;&gt;&lt;span class=&quot;z-punctuation z-section z-embedded z-begin z-nix&quot;&gt;${&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;dockercli&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-section z-embedded z-end z-nix&quot;&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; network create filerun-br
&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-string z-quoted z-other z-nix&quot;&gt;             else
&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-string z-quoted z-other z-nix&quot;&gt;               echo &amp;quot;filerun-br already exists in docker&amp;quot;
&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-string z-quoted z-other z-nix&quot;&gt;             fi
&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-string z-quoted z-other z-nix&quot;&gt;           &lt;span class=&quot;z-punctuation z-definition z-string z-other z-end z-nix&quot;&gt;&amp;#39;&amp;#39;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-attrset z-nix&quot;&gt;}&lt;&#x2F;span&gt;&lt;span class=&quot;z-invalid z-illegal&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;This ensures that the network exists whenever it is needed. Attaching
the database container to the bridge network then takes only one extra
line (see the last line of the snippet below).&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix z-code&quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;docker-containers&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator z-nix&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;filerun-mariadb&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-invalid z-illegal&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-attrset-or-function z-nix&quot;&gt;{&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;image&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;mariadb:10.1&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;environment&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-attrset-or-function z-nix&quot;&gt;{&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;MYSQL_ROOT_PASSWORD&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;randompasswd&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;MYSQL_USER&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;filerun&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;MYSQL_PASSWORD&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;randompasswd&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;MYSQL_DATABASE&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;filerundb&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-punctuation z-definition z-attrset z-nix&quot;&gt;}&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;volumes&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;[&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&#x2F;home&#x2F;delegator&#x2F;filerun&#x2F;db:&#x2F;var&#x2F;lib&#x2F;mysql&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;]&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;extraDockerOptions&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;[&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;--network=filerun-br&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;]&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-attrset z-nix&quot;&gt;}&lt;&#x2F;span&gt;&lt;span class=&quot;z-invalid z-illegal&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;h2 id=&quot;the-web-server-container&quot;&gt;The Web Server Container&lt;&#x2F;h2&gt;
&lt;p&gt;The web server container then follows much the same pattern as the
database container.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix z-code&quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-variable z-parameter z-name z-nix&quot;&gt;docker-containers&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator z-nix&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;filerun&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-invalid z-illegal&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-attrset-or-function z-nix&quot;&gt;{&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;image&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;afian&#x2F;filerun&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;environment&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-attrset-or-function z-nix&quot;&gt;{&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;FR_DB_HOST&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;filerun-mariadb&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;  &lt;span class=&quot;z-comment z-line z-number-sign z-nix&quot;&gt;# !! IMPORTANT&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;FR_DB_PORT&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;3306&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;FR_DB_NAME&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;filerundb&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;FR_DB_USER&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;filerun&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;FR_DB_PASS&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;randompasswd&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;APACHE_RUN_USER&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;delegator&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;APACHE_RUN_USER_ID&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;600&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;APACHE_RUN_GROUP&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;delegator&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;APACHE_RUN_GROUP_ID&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;600&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-punctuation z-definition z-attrset z-nix&quot;&gt;}&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;ports&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;[&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;6000:80&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;]&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;volumes&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;[&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&#x2F;home&#x2F;delegator&#x2F;filerun&#x2F;web:&#x2F;var&#x2F;www&#x2F;html&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;    &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&#x2F;home&#x2F;delegator&#x2F;filerun&#x2F;user-files:&#x2F;user-files&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;]&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;  &lt;span class=&quot;z-entity z-other z-attribute-name z-multipart z-nix&quot;&gt;extraDockerOptions&lt;&#x2F;span&gt; &lt;span class=&quot;z-keyword z-operator z-bind z-nix&quot;&gt;=&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;[&lt;&#x2F;span&gt; &lt;span class=&quot;z-string z-quoted z-double z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-string z-double z-start z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;--network=filerun-br&lt;span class=&quot;z-punctuation z-definition z-string z-double z-end z-nix&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt; &lt;span class=&quot;z-punctuation z-definition z-list z-nix&quot;&gt;]&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-terminator z-bind z-nix&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;span class=&quot;z-source z-nix&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-attrset z-nix&quot;&gt;}&lt;&#x2F;span&gt;&lt;span class=&quot;z-invalid z-illegal&quot;&gt;;&lt;&#x2F;span&gt;
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;It joins the same bridge network. The most important line here (marked
above) sets the value of the environment variable
&lt;code&gt;&quot;FR_DB_HOST&quot;&lt;&#x2F;code&gt;. I did some experiments and found that within the same
bridge network, one container can reach another by using that container&#x27;s
name as the hostname. Since NixOS&#x27;s &lt;code&gt;docker-containers&lt;&#x2F;code&gt; module names
containers following exactly this convention, I simply put the
other container&#x27;s name there &lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#2&quot;&gt;2&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Important Note&lt;&#x2F;strong&gt;: If you are using NixOS 19.09 or older,
the naming convention for docker containers is different.
Nothing else needs to change; just make sure your &lt;code&gt;FR_DB_HOST&lt;&#x2F;code&gt; is
set to &lt;code&gt;docker-filerun-mariadb.service&lt;&#x2F;code&gt; instead.&lt;&#x2F;p&gt;
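&lt;p&gt;For illustration, on those older versions the only binding that would change is the following (a sketch; every other attribute stays exactly as in the snippet above):&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;nix&quot; class=&quot;language-nix z-code&quot;&gt;&lt;code class=&quot;language-nix&quot; data-lang=&quot;nix&quot;&gt;# NixOS 19.09 and older: containers are addressed by their systemd unit name
&amp;quot;FR_DB_HOST&amp;quot; = &amp;quot;docker-filerun-mariadb.service&amp;quot;;
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;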
&lt;p&gt;With those, everything should be up and running!&lt;&#x2F;p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;&#x2F;h2&gt;
&lt;p&gt;A more comprehensive FileRun service module, built on what is
demonstrated in this article, can be found
&lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;git.breakds.org&#x2F;breakds&#x2F;nixvital&#x2F;src&#x2F;branch&#x2F;master&#x2F;modules&#x2F;services&#x2F;filerun.nix&quot;&gt;here&lt;&#x2F;a&gt;.
I omitted the details of adding options and other flexibility to the
service module, as those might be distracting here.&lt;&#x2F;p&gt;
&lt;p&gt;I found it very simple to spin up docker-container-based services
with the &lt;code&gt;docker-containers&lt;&#x2F;code&gt; module. I hope it helps you as well.&lt;&#x2F;p&gt;
&lt;div class=&quot;footnote-definition&quot; id=&quot;1&quot;&gt;&lt;sup class=&quot;footnote-definition-label&quot;&gt;1&lt;&#x2F;sup&gt;
&lt;p&gt;You will need a license to actually self-host FileRun. See &lt;a rel=&quot;noopener&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;filerun.com&#x2F;pricing&quot;&gt;here&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;&#x2F;div&gt;
&lt;div class=&quot;footnote-definition&quot; id=&quot;2&quot;&gt;&lt;sup class=&quot;footnote-definition-label&quot;&gt;2&lt;&#x2F;sup&gt;
&lt;p&gt;It would be better if I could read the container&#x27;s name
directly from &lt;code&gt;config.docker-containers.filerun-mariadb&lt;&#x2F;code&gt;, so that this
would keep working even if the naming convention changes. I could
not find such an interface in the &lt;code&gt;docker-containers&lt;&#x2F;code&gt; module, though.&lt;&#x2F;p&gt;
&lt;&#x2F;div&gt;
</content>
        
    </entry>
</feed>
