Blog | gracefullight.dev

CNN 007

April 13, 2026 · 6 min read

Gracefullight

Owner

Transfer Learning

Knowledge acquired while solving one task can be used to solve related tasks.
Similar to the way humans apply knowledge acquired from one task to solve a new but related task.

Transfer Learning Benefits

Less training data required: Model trained using a large (similar) dataset can be used as a starting point for training on a smaller dataset.
Faster training: Training can converge faster because it starts from existing knowledge (weights) rather than from scratch.
Better model generalization: Model is trained to identify features which can be applied to new contexts.

VGG-16

Approach	Description	Use Case	When to Use
Use Pre-trained Model	Use ImageNet pre-trained model without any additional training	Dogs & cats classification	When dataset distribution is similar to ImageNet with few samples
Train FC Layers Only	Use CONV layers for feature extraction, train FC layers only	Different class classification on similar domain	When dataset is similar to ImageNet but different classes with limited samples
Train Last CONV + FC Layers	Train last CONV layers (specialized features) and FC layers	Significantly different data distribution domain	When dataset differs greatly from ImageNet, different classes, and limited samples
Train All CONV + FC Layers	Train all CONV layers and FC layers (with modifications)	Complex task with different domain	When dataset differs greatly from ImageNet, different classes, dataset is large, and task is complex

AlexNet

Input: 224x224x3 image
Activations: ReLU after each CONV and FC layer
Optimizer: SGD with Momentum
Regularization: Dropout in FC1 and FC2
Total Trainable Parameters: ~60 million
Training settings: Nvidia GTX 580 3GB GPUs for 6 days

GoogleNet

Accuracy: top-5 test error rate of 6.7%
Close to human level performance
22 layer deep CNN
Optimizer: RMSProp
Total Trainable Parameters: ~4 million (Significantly reduced)
A novel inception module was introduced

Inception Module

Use filters with different size together
Use different types of layers (CONV, POOL etc.) together
This leads to better performance and efficiency, at the cost of a more complicated architecture.

1x1 Convolution

Input image ( $6 \times 6 \times 1$ ), 1x1 kernel, and output can be declared as:

X= \begin{bmatrix} 100&100&100&0&0&0\\ 100&100&100&0&0&0\\ 100&100&100&0&0&0\\ 100&100&100&0&0&0\\ 100&100&100&0&0&0\\ 100&100&100&0&0&0 \end{bmatrix}, \quad K=\begin{bmatrix}3\end{bmatrix}

Y = K * X

Y= \begin{bmatrix} 300&300&300&0&0&0\\ 300&300&300&0&0&0\\ 300&300&300&0&0&0\\ 300&300&300&0&0&0\\ 300&300&300&0&0&0\\ 300&300&300&0&0&0 \end{bmatrix}

For channel reduction with a 1x1 convolution, each spatial location $(i,j)$ is a vector:

\mathbf{x}_{i,j} \in \mathbb{R}^{256}

One 1x1 layer with 128 filters is a matrix:

W \in \mathbb{R}^{128 \times 256},\quad \mathbf{b} \in \mathbb{R}^{128}

At each location, output channels are computed by matrix multiplication:

\mathbf{z}_{i,j}=W\mathbf{x}_{i,j}+\mathbf{b},\quad \mathbf{y}_{i,j}=\mathrm{ReLU}(\mathbf{z}_{i,j})

So the shape changes as:

64\times64\times256 \;\xrightarrow{\;1\times1\;\text{Conv (128 filters)}+\mathrm{ReLU}\;} 64\times64\times128

If we flatten all spatial positions ( $64\times64=4096$ ):

X_{\text{flat}} \in \mathbb{R}^{4096\times256},\quad Y_{\text{flat}}=\mathrm{ReLU}\left(X_{\text{flat}}W^T+\mathbf{1}\mathbf{b}^T\right) \in \mathbb{R}^{4096\times128}
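The per-location matrix multiplication above can be sketched in plain Python. This is a minimal illustration, not a framework implementation: the helper names `relu` and `conv1x1` are made up for this example, and toy sizes (2x2 spatial, 4 input channels, 2 filters) stand in for the 64x64x256 to 64x64x128 case.

```python
def relu(values):
    """Element-wise ReLU on a list of numbers."""
    return [max(0.0, v) for v in values]


def conv1x1(x, weights, bias):
    """Apply a 1x1 convolution followed by ReLU.

    x:       H x W x C_in nested lists
    weights: C_out x C_in (one row per filter)
    bias:    C_out
    """
    out = []
    for row in x:
        out_row = []
        for pixel in row:
            # z = W x + b, computed independently at each spatial location
            z = [sum(w * c for w, c in zip(w_row, pixel)) + b
                 for w_row, b in zip(weights, bias)]
            out_row.append(relu(z))
        out.append(out_row)
    return out


# Hypothetical toy input: 2 x 2 spatial grid with 4 channels
x = [[[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]],
     [[-1.0, 0.0, 1.0, 0.0], [2.0, 2.0, 2.0, 2.0]]]
weights = [[1.0, 0.0, 0.0, 0.0],       # filter 0 copies channel 0
           [0.25, 0.25, 0.25, 0.25]]   # filter 1 averages all channels
bias = [0.0, 0.0]

y = conv1x1(x, weights, bias)
print(len(y), len(y[0]), len(y[0][0]))  # spatial size is kept, channels go 4 -> 2
```

The spatial grid is untouched; only the channel dimension changes, which is why 1x1 convolutions are used for channel reduction.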

Inception V2 and V3

V1 (GoogleNet): Replace one 5x5 conv with two stacked 3x3 conv layers.
- Number of parameters: $5^2=25$ vs. $2\times3^2=18$ (about 28% reduction)
V2: Factorize an $n\times n$ conv into $1\times n$ and $n\times1$ convs.
- For $3\times3$ : $3^2=9$ vs. $3+3=6$ (about 33% reduction)
V3: Use more aggressive factorization and branch design (e.g., $1\times7$ and $7\times1$ ), plus efficient grid-size reduction.
- Improves the accuracy-efficiency tradeoff while keeping computation manageable
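The weight counts quoted above can be reproduced with a few lines of arithmetic. A minimal sketch that counts weights per input/output channel pair and ignores biases; the function names are illustrative only.

```python
def full_kernel_weights(n):
    """Weights in a single n x n kernel."""
    return n * n


def stacked_3x3_weights(layers):
    """Weights in `layers` stacked 3 x 3 kernels."""
    return layers * 3 * 3


def asymmetric_weights(n):
    """Weights in a 1 x n kernel followed by an n x 1 kernel."""
    return n + n


def reduction(before, after):
    """Fraction of weights saved by the factorization."""
    return 1 - after / before


# 5x5 replaced by two stacked 3x3 kernels
print(full_kernel_weights(5), stacked_3x3_weights(2),
      round(reduction(25, 18), 2))

# 3x3 replaced by 1x3 followed by 3x1
print(full_kernel_weights(3), asymmetric_weights(3),
      round(reduction(9, 6), 2))

# 7x7 replaced by 1x7 followed by 7x1
print(full_kernel_weights(7), asymmetric_weights(7),
      round(reduction(49, 14), 2))
```

The saving grows with the kernel size, which is one reason the larger asymmetric factorizations are attractive.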

ResNet

Deep Residual Networks, skip connections, and identity mappings

Enabled the development of much deeper networks.
ResNet is composed of residual blocks, which were introduced to address the vanishing gradient problem in deep networks.
- Degradation problem: adding more layers eventually has a negative effect on the final performance.

Git Tricks for Onboarding to a New Codebase

April 10, 2026 · 3 min read

Gracefullight

Owner

High-Churn Files

git log --format=format: --name-only --since="1 year ago" | sort | uniq -c | sort -nr | head -20

Shows the most frequently changed files in the last year.
These files often represent areas of the codebase with the highest maintenance burden.
The top files can be cross-analyzed with bug hotspots to identify the highest-risk parts of the system.

Code Ownership and Bus Factor

git shortlog -sn --no-merges

Shows the number of commits by each author, excluding merge commits.
If one person accounts for more than 60% of commits, the project may have a bus factor risk.
If a top contributor has not been active in the last 6 months, it may indicate a maintenance gap.
If only 3 out of 30 contributors have been active over the past year, this suggests a knowledge discontinuity caused by developer turnover.
However, if the team uses squash merges, the commit history may be misleading for this analysis.
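The 60% rule of thumb above can be checked mechanically from the `git shortlog -sn` output. A minimal sketch, assuming the usual "count, tab, author" line layout; the sample text, the author names, and the helper `author_shares` are hypothetical.

```python
def author_shares(shortlog_output):
    """Map each author to their fraction of all commits."""
    counts = {}
    for line in shortlog_output.strip().splitlines():
        # Each line looks like "   130<TAB>Author Name"
        count, author = line.strip().split(None, 1)
        counts[author] = int(count)
    total = sum(counts.values())
    return {author: count / total for author, count in counts.items()}


# Hypothetical output of `git shortlog -sn --no-merges`
sample = "   130\tAlice\n    40\tBob\n    30\tCarol\n"

shares = author_shares(sample)
top_author, top_share = max(shares.items(), key=lambda item: item[1])
at_risk = top_share > 0.60  # threshold taken from the rule of thumb above

print(top_author, top_share, at_risk)
```

In practice the text would come from running the command in the repository, for example through `subprocess.run`, rather than from a hard-coded string.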

Bug Hotspots

Shows the top 20 files with the most bug-related commits.
By comparing this list with the high-churn files, we can identify code that is both frequently changed and bug-prone.
While the accuracy depends on the quality of commit messages, even an approximate bug hotspot map can still be useful.

Development Velocity: Acceleration or Stagnation

git log --format='%ad' --date=format:'%Y-%m' | sort | uniq -c

Monthly commit counts provide a visual view of project activity over time.
A consistent or increasing commit frequency suggests healthy development.
A sudden drop, such as a 50% decrease in commits within a month, may signal the departure of key contributors or a shift in project focus.
A sustained decline over 6-12 months suggests a loss of team momentum, while periodic spikes followed by stagnation may indicate a batch-style release pattern.
In one real-world case, a CTO recognized from a commit velocity chart that a specific point in time aligned with the departure of a senior engineer.
This data reflects not just code activity, but team dynamics.

Reverts, Hotfixes, and Firefighting Signals

git log --oneline --since="1 year ago" | grep -iE 'revert|hotfix|emergency|rollback'

Measures the frequency of urgent fixes and recovery actions.
A few incidents per year are normal, but incidents every two weeks may signal a lack of trust in the deployment process.
This often indicates deeper issues such as unstable tests, the absence of a staging environment, or complex rollback procedures.
A result of zero may indicate either a stable codebase or poorly labeled commit messages.
Crisis patterns tend to be clearly visible, and their mere presence is often enough to assess operational reliability.

IQC 006

March 30, 2026 · 15 min read

Gracefullight

Owner

Boolean Functions

$f: \{0,1\}^n \to \{0,1\}$

x	$f_0$	$f_1$	$f_2$	$f_3$
0	0	0	1	1
1	0	1	0	1

$n=1$
Constant: same output for all inputs ( $f_0$ and $f_3$ ).
Balanced: outputs 0 for exactly half of the inputs and 1 for the other half ( $f_1$ and $f_2$ ).
In the worst case, a classical algorithm needs $2^{n-1}+1$ queries to decide which type $f$ is, which is exponential in $n$ .

\begin{array}{c|cccccccccccccccc} x & {\color{orange}{f_0}} & f_1 & f_2 & {\color{blue}{f_3}} & f_4 & {\color{blue}{f_5}} & {\color{blue}{f_6}} & f_7 & f_8 & {\color{blue}{f_9}} & {\color{blue}{f_{10}}} & f_{11} & {\color{blue}{f_{12}}} & f_{13} & f_{14} & {\color{orange}{f_{15}}} \\ \hline 00 & {\color{orange}0} & 0 & 0 & {\color{blue}0} & 0 & {\color{blue}0} & {\color{blue}0} & 0 & 1 & {\color{blue}1} & {\color{blue}1} & 1 & {\color{blue}1} & 1 & 1 & {\color{orange}1} \\ 01 & {\color{orange}0} & 0 & 0 & {\color{blue}0} & 1 & {\color{blue}1} & {\color{blue}1} & 1 & 0 & {\color{blue}0} & {\color{blue}0} & 0 & {\color{blue}1} & 1 & 1 & {\color{orange}1} \\ 10 & {\color{orange}0} & 0 & 1 & {\color{blue}1} & 0 & {\color{blue}0} & {\color{blue}1} & 1 & 0 & {\color{blue}0} & {\color{blue}1} & 1 & {\color{blue}0} & 0 & 1 & {\color{orange}1} \\ 11 & {\color{orange}0} & 1 & 0 & {\color{blue}1} & 0 & {\color{blue}1} & {\color{blue}0} & 1 & 0 & {\color{blue}1} & {\color{blue}0} & 1 & {\color{blue}0} & 1 & 0 & {\color{orange}1} \end{array}
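The classical worst-case query count can be illustrated with a short routine. A minimal sketch, assuming inputs are encoded as integers from 0 to $2^n-1$ and that the constant-or-balanced promise holds; the name `classify_classically` is illustrative.

```python
def classify_classically(f, n):
    """Decide constant vs balanced by querying f one input at a time.

    Returns (verdict, number_of_queries). Under the promise, seeing
    2**(n-1) + 1 identical outputs proves that f is constant.
    """
    first = f(0)
    queries = 1
    limit = 2 ** (n - 1) + 1
    for x in range(1, limit):
        queries += 1
        if f(x) != first:
            return "balanced", queries
    return "constant", queries


n = 3
constant_zero = lambda x: 0
unlucky_balanced = lambda x: 1 if x >= 4 else 0  # differs only on the last half
lucky_balanced = lambda x: x & 1                 # differs on the second query

print(classify_classically(constant_zero, n))
print(classify_classically(unlucky_balanced, n))
print(classify_classically(lucky_balanced, n))
```

The unlucky balanced function and the constant function both need all $2^{n-1}+1$ queries, which is the exponential worst case.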

The Quantum Oracle

Traditional Oracle: black box that computes $f$ .
Query complexity: number of queries to the oracle needed to solve a problem.

$x \xrightarrow{O_f} f(x)$

Quantum Oracle: unitary operation that encodes $f$ .

$U_f \ket{x}\ket{y} = \ket{x}\ket{y \oplus f(x)}$

First register: input $x$ .
Second register: auxiliary qubit initialized to $\ket{0}$ or $\ket{1}$ .
The oracle does not change $x$ , but flips $y$ if $f(x)=1$ .
- $0 \oplus 0 = 0$
- $0 \oplus 1 = 1$
- $1 \oplus 0 = 1$
- $1 \oplus 1 = 0$
$f(x) = 0$ : $y$ unchanged.
$f(x) = 1$ : $y$ flipped.
Example:
- $\ket{x}\ket{0}$
- $U_f \ket{x}\ket{0} = \ket{x}\ket{0 \oplus f(x)} = \ket{x}\ket{f(x)}$ .
- if $f(x) = 0$ : $\ket{x}\ket{0}$ .
- if $f(x) = 1$ : $\ket{x}\ket{1}$
The oracle leaves the input $x$ unchanged and only flips the second qubit if $f(x)=1$ .
It is reversible: $(y \oplus f(x)) \oplus f(x) = y$ .

Phase Kickback

Prepare the scratch qubit in $\ket{-} = \frac{1}{\sqrt{2}}(\ket{0} - \ket{1})$ .
- $U_f \ket{x}\ket{-} = \frac{1}{\sqrt{2}}(U_f \ket{x} \ket{0} - U_f \ket{x} \ket{1})$
- $= \frac{1}{\sqrt{2}}(\ket{x} \ket{0 \oplus f(x)} - \ket{x}\ket{1 \oplus f(x)})$
- if $f(x) = 0 \rightarrow \\ \frac{1}{\sqrt{2}}(\ket{x}\ket{0 \oplus 0} - \ket{x}\ket{1 \oplus 0}) \\ = \frac{1}{\sqrt{2}}(\ket{x}\ket{0} - \ket{x}\ket{1}) \\ = \ket{x} \frac{1}{\sqrt{2}}(\ket{0} - \ket{1}) \\ = \ket{x}\ket{-}$ .
- if $f(x) = 1 \rightarrow \\ \frac{1}{\sqrt{2}}(\ket{x}\ket{0 \oplus 1} - \ket{x}\ket{1 \oplus 1}) \\ = \frac{1}{\sqrt{2}}(\ket{x}\ket{1} - \ket{x}\ket{0}) \\ = - \ket{x} \frac{1}{\sqrt{2}}(\ket{0} - \ket{1}) \\ = -\ket{x}\ket{-}$ .
  - it doesn't matter where the phase is, it can be moved around:
  - $\alpha(\ket{\psi} \otimes \ket{\phi}) = \alpha\ket{\psi} \otimes \ket{\phi} = \ket{\psi} \otimes \alpha\ket{\phi}$ .
If we prepare the second register in $\ket{-}$ , $f(x)$ is encoded in the phase of the first register.

$\ket{x}\ket{-}\mapsto (-1)^{f(x)} \ket{x} \ket{-}$

The function's output has been "kicked back" into the phase of the first register, while the second register remains unchanged.
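Phase kickback can be verified numerically on a tiny state vector. A minimal sketch for a single input bit, where a state is a dictionary from basis labels `(x, y)` to amplitudes; the helpers `apply_oracle`, `x_ket_minus`, and `kickback_sign` are made up for this illustration.

```python
import math


def apply_oracle(state, f):
    """Apply U_f |x>|y> = |x>|y XOR f(x)> to a dict of amplitudes."""
    new_state = {}
    for (x, y), amplitude in state.items():
        key = (x, y ^ f(x))
        new_state[key] = new_state.get(key, 0.0) + amplitude
    return new_state


def x_ket_minus(x):
    """The state |x>|-> = (|x>|0> - |x>|1>) / sqrt(2)."""
    s = 1 / math.sqrt(2)
    return {(x, 0): s, (x, 1): -s}


def kickback_sign(x, f):
    """Return the factor picked up by |x>|-> when U_f is applied."""
    before = x_ket_minus(x)
    after = apply_oracle(before, f)
    return round(after[(x, 0)] / before[(x, 0)])


identity = lambda x: x  # balanced function f(x) = x on one bit

for x in (0, 1):
    print(x, kickback_sign(x, identity), (-1) ** identity(x))
```

The two printed values agree for each $x$: the oracle only permutes the scratch amplitudes, and on $\ket{-}$ that permutation shows up as the sign $(-1)^{f(x)}$.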

The Deutsch Problem

Given a boolean function $f:\{0,1\}^{n} \to \{0,1\}$ , determine if $f$ is constant or balanced.
Constant: same output for all inputs.
Balanced: outputs 0 for half the inputs and 1 for the other half.

Deutsch's Algorithm

Prepare $\ket{0}\ket{1}$ .
Apply $H$ to both qubits $\to \ket{+}\ket{-}$ .
Apply oracle $U_f$ .
Phase kickback encodes $f$ in the input phase.
Apply $H$ to input qubit, then measure.

One query to the oracle is sufficient to determine if $f$ is constant or balanced.
If we measure 0: $f$ is constant.
If we measure 1: $f$ is balanced.

Deutsch-Jozsa Algorithm

The direct generalization for any $n$ -bit boolean function.

Input register $n$ qubits: Initialize qubits in $\ket{0}$ and apply $H$ gate to each one.
Scratch Qubit $1$ qubit: Initialize in $\ket{1}$ with $X$ and then apply an $H$ gate.
Oracle: Apply $U_f$ to the input and scratch registers (All qubits).
Final Hadamards: Apply $H$ to input qubits.
Measurement: Measure the input register.

if all qubits returned $0$ : $f$ is constant.
if any qubit returned $1$ : $f$ is balanced.

How it works

The initial state is

$\ket{00 \cdots 0} \ket{1}$

After applying $H$ gates to the input and scratch registers, we get

$\ket{00 \cdots 0} \rightarrow \frac{1}{\sqrt{2^n}} \left( \ket{00 \cdots 0} + \ket{00 \cdots 1} + \cdots + \ket{11 \cdots 1} \right)$

The input register is in a superposition of all computational basis states, that is, all possible $n$ -bit input strings.
The last qubit is unchanged from the $n = 1$ case, so it is in the state $\ket{-}$ .
- $H\ket{1} = \ket{-}$
The definition of the oracle was generic for any bit string: $U_f \ket{x}\ket{-} = (-1)^{f(x)} \ket{x} \ket{-}$
The phase kickback puts a phase in front of each term in the input register that depends on the output of the function $f$ : $\frac{1}{\sqrt{2^n}}\big(\;(-1)^{f(00\ldots0)}|00\ldots 0\rangle+ \cdots +(-1)^{f(11\ldots1)}|11\ldots 1\rangle\;\big)$
- The scratch qubit remains in $\ket{-}$ , so we can ignore it for the rest of the algorithm.
Constant case
- if $f$ is constant, then all the phases are the same, either $+1$ or $-1$ :
- $f(00 \cdots 0) = f(00 \cdots 1) = \cdots = f(11 \cdots 1) = 0$
- $f(00 \cdots 0) = f(00 \cdots 1) = \cdots = f(11 \cdots 1) = 1$
- The phase in front of every computational basis state is the same, either $+1$ or $-1$ .
- Before the second application of the $H$ gates, the state of the input register is:
- $\pm \frac{1}{\sqrt{2^n}} \left( \ket{00 \cdots 0} + \ket{00 \cdots 1} + \cdots + \ket{11 \cdots 1} \right)$
- The final $H$ gates map this state back to $\pm\ket{00 \cdots 0}$ , so in either case the measurement gives $P(00 \cdots 0) = 1$ .
- A constant function deterministically returns all zeros with a single query.
Balanced case
- We are promised that the function is either constant or balanced, so strictly speaking we do not need to analyze this case separately.
- If the measurement produces anything but all zeros, we know with certainty the function is not constant, so it must be balanced.
- There are many balanced functions, but for each of them exactly half the terms in the superposition have a $-1$ phase.
- Such a state is orthogonal to the uniform superposition in which every term has the same sign.
- Applying the $H$ gates changes the state to some other superposition, or perhaps a single computational basis state.
- But the $H$ gates are unitary and map the uniform superposition to $\ket{00 \cdots 0}$ , so orthogonality to $\ket{00 \cdots 0}$ must remain.
- A balanced function deterministically returns a state with at least one entry equal to $1$ , with a single query.

One quantum query vs $2^{n-1} + 1$ classical queries. It is an algorithm that puts all inputs into superposition at once, encodes the function values as phases, and then uses interference to distinguish between constant and balanced functions.
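The interference argument can be checked numerically without simulating gates. A minimal sketch that uses the closed-form amplitude of outcome $z$ after the final Hadamards, $\frac{1}{2^n}\sum_x (-1)^{f(x) \oplus (x \cdot z)}$, with inputs encoded as integers; the function names are illustrative and this is a shortcut, not a circuit simulation.

```python
def dot_mod2(a, b):
    """Bitwise inner product of two integers, modulo 2."""
    return bin(a & b).count("1") % 2


def output_amplitude(f, n, z):
    """Amplitude of basis state |z> on the input register after the final H gates."""
    total = sum((-1) ** (f(x) ^ dot_mod2(x, z)) for x in range(2 ** n))
    return total / 2 ** n


def deutsch_jozsa(f, n):
    """Classify f from the probability of measuring all zeros."""
    p_all_zeros = output_amplitude(f, n, 0) ** 2
    return "constant" if p_all_zeros > 0.5 else "balanced"


n = 3
constant_one = lambda x: 1
parity_of_last_bit = lambda x: x & 1  # balanced

print(output_amplitude(constant_one, n, 0), deutsch_jozsa(constant_one, n))
print(output_amplitude(parity_of_last_bit, n, 0), deutsch_jozsa(parity_of_last_bit, n))
```

For a constant function all terms add up with the same sign, while for a balanced function they cancel exactly at $z = 00 \cdots 0$.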

Building Oracles

Multi-Controlled and Anti-Controlled Gates

Build $U_f$ that applies $X$ to the output qubit exactly when $f(x) = 1$ .
- if $n=2$ , the multi-controlled $X$ gate is a Toffoli gate.
- if $x=11$ is the only input that gives $f(x) = 1$ , we can use a Toffoli gate with controls on the first two qubits and target on the output qubit.
  - This is because the Toffoli gate will flip the output qubit if and only if both control qubits are $\ket{1}$ , which corresponds to the input $x=11$ .
- if $x=00$ is the only input that gives $f(x) = 1$ , we can use an anti-controlled Toffoli gate, which applies $X$ to the target qubit if both control qubits are $\ket{0}$ .
  - This is because the anti-controlled Toffoli gate will flip the output qubit if and only if both control qubits are $\ket{0}$ , which corresponds to the input $x=00$ .
  - Applying $X$ gates to the two control qubits, then applying a Toffoli gate, and then applying $X$ gates again to the control qubits will effectively create an anti-controlled Toffoli gate.

Anti-Controlled Gates

A multi-controlled $X$ gate ( $C^nX$ ) flips the target only when all control qubits are $\ket{1}$ .
To target a specific input $x$ :
- Place $X$ gates on each qubit $i$ where $x_i = 0$ . (anti-control)
- Apply $C^nX$ .
- Undo the $X$ gates.
For $n$ qubits, this generalizes to an $X$ gate (or any $U$ ) with $n$ control qubits, requiring $n-1$ extra scratch qubits and $2(n-1)$ Toffoli gates.
Whenever scratch qubits are involved, we always see a symmetric pattern of gates.
Applying two oracles in sequence XORs their functions: $U_{f_8} U_{f_1} \ket{x}\ket{y} = \ket{x}\ket{y \oplus f_1(x) \oplus f_8(x)} = \ket{x}\ket{y \oplus f_9(x)}$
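The composition identity for $f_1$, $f_8$, and $f_9$ can be checked on classical basis states. A minimal sketch, assuming the numbering of the two-bit truth table earlier in this post (the outputs for 00, 01, 10, 11 are the binary digits of $k$, most significant first) and encoding the input as an integer 0..3; `f_k` and `oracle` are illustrative helpers.

```python
from itertools import product


def f_k(k):
    """Two-bit Boolean function number k from the truth table."""
    def f(x):
        return (k >> (3 - x)) & 1
    return f


def oracle(f):
    """Classical action of U_f on basis states: (x, y) -> (x, y XOR f(x))."""
    def u_f(x, y):
        return x, y ^ f(x)
    return u_f


u_f1, u_f8, u_f9 = oracle(f_k(1)), oracle(f_k(8)), oracle(f_k(9))


def composition_matches():
    """Check U_f8 U_f1 == U_f9 on every basis state |x>|y>."""
    return all(u_f8(*u_f1(x, y)) == u_f9(x, y)
               for x, y in product(range(4), range(2)))


print(composition_matches())
print(u_f1(*u_f1(3, 0)))  # applying the same oracle twice undoes it
```

Because each oracle only XORs into the output bit, any function can be built up one marked input at a time, which is what the multi-controlled gate construction does.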

CCCX

The computation is done using the scratch qubits.
The answer is copied to the target or output register
The computation is inverted to reset the scratch qubits to $\ket{0}$ .

This is called "uncomputation"; it ensures the scratch qubits are returned to their initial state.
No input or output qubits are entangled with the scratch qubits at the end of the algorithm.

Implementing Deutsch-Jozsa

import numpy as np
import qiskit
import qiskit.qasm2


def random_oracle(n):
    # Circuit object to hold the gates: n input qubits plus the output qubit q[n]
    circuit = qiskit.QuantumCircuit(n + 1)

    # With 50% probability, return a constant oracle
    if np.random.randint(0, 2):
        qasm = ""
        # Another 50:50 chance of it being the constant-1 instead of the constant-0 oracle
        if np.random.randint(0, 2):
            # Constant 1: flip the output qubit q[n] for every input
            qasm += f"x q[{n}];"
        return qasm, "constant"

    # A balanced function outputs 1 on exactly half of the inputs
    # Randomly select which inputs those are
    one_strings = np.random.choice(range(2**n), int(2**(n-1)), replace=False)

    for string in one_strings:

        # Convert base 10 to an n-bit binary string (zero-padded)
        bitstring = f"{int(string):0{n}b}"

        # X gates on the qubits whose bit is 0 (anti-controls)
        for xi, bit in enumerate(reversed(bitstring)):  # xi is the qubit index of this bit
            if bit == "0":
                circuit.x(xi)

        # C^n X gate
        circuit.mcx(list(range(n)), n)

        # Undo the X gates on the 0 locations
        for xi, bit in enumerate(reversed(bitstring)):
            if bit == "0":
                circuit.x(xi)

    transpiled_circuit = qiskit.transpile(circuit, basis_gates=["u1", "u3", "u2", "cx"])
    # Drop the QASM header (version, include, and qreg lines); the fixed offset assumes a single-digit register size
    qasm = qiskit.qasm2.dumps(transpiled_circuit)[47:]
    return qasm, "balanced"

OPENQASM 2.0;
qreg q[3];
creg c[2];

x q[2];
h q[0];
h q[1];
h q[2];

/* oracle for f(00)=1 */
x q[0];
x q[1];
ccx q[0],q[1],q[2];
x q[0];
x q[1];

/* oracle for f(11)=1 */
ccx q[0],q[1],q[2];

h q[0];
h q[1];

measure q[0] -> c[0];
measure q[1] -> c[1];

OPENQASM 2.0;
qreg q[n+1];
creg c[n];

x q[n];
h q[0];
h q[1];
...
h q[n];
/* oracle U_f */
h q[0];
h q[1];
...
h q[n-1];
measure q[0] -> c[0];
...
measure q[n-1] -> c[n-1];

Bernstein-Vazirani Algorithm

Given $f_s: \{0,1\}^n \to \{0,1\}$ defined as $f_s(x) = s \cdot x \mod 2$
- where $s$ is an unknown $n$ -bit string and $x$ is the input.
The goal is to determine the hidden string $s$ in as few queries as possible.
Classically: query $f$ with input $e_0 = 00 \cdots 01$ , $e_1 = 00 \cdots 10$ , ..., $e_{n-1} = 10 \cdots 00$ to get each bit of $s$ .
- Total $n$ queries.
Quantum: use the exact same circuit as Deutsch-Jozsa.
- $U_{f_s} \ket{x} = (-1)^{s \cdot x} \ket{x}$ .
  - where we can ignore the output register in the $\ket{-}$ state.
  - For $n = 1$ , the state after the oracle and before the final $H$ gate is:
    - $\frac{1}{\sqrt{2}} \left( \ket{0} + (-1)^{s} \ket{1} \right)$
  - which is $\ket{+}$ if $s=0$ and $\ket{-}$ if $s=1$ .
- Applying $H$ to this state returns $s$ :
  - $H \frac{1}{\sqrt{2}} \left( \ket{0} + (-1)^{s} \ket{1} \right) = \ket{s}$ .
- The measurement deterministically reveals $s$ .
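For comparison with the single quantum query, the classical strategy of querying each unit vector $e_i$ can be sketched as follows. A minimal illustration, assuming bit strings are encoded as integers with bit $i$ standing for $s_i$; `make_f` and `recover_secret_classically` are hypothetical helper names.

```python
def make_f(s):
    """Return f_s(x) = s . x mod 2 for a secret s encoded as an integer."""
    def f(x):
        return bin(s & x).count("1") % 2
    return f


def recover_secret_classically(f, n):
    """Recover s with n queries, one per unit vector e_i."""
    s = 0
    for i in range(n):
        if f(1 << i):      # f(e_i) = s_i
            s |= 1 << i
    return s


secret = 0b01101
recovered = recover_secret_classically(make_f(secret), 5)
print(format(recovered, "05b"))
```

Each query reveals exactly one bit of $s$, so $n$ queries are needed, whereas the quantum circuit reads all bits from one query.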

n=2 case

$s = s_1s_0$ and $x = x_1 x_0$
$\rightarrow s\cdot x = s_1x_1 + s_0 x_0$

\begin{aligned} &\frac{1}{2}\big((-1) ^{s_0\cdot 0+s_1\cdot 0} \ket{00} + (-1) ^{s_0\cdot 1+s_1\cdot 0} \ket{01}+(-1) ^{s_0\cdot 0+s_1\cdot 1} \ket{10}+(-1) ^{s_0\cdot 1+s_1\cdot 1} \ket{11}\big)\\ &=\frac{1}{2}\big( \ket{00} + (-1) ^{s_0} \ket{01}+(-1) ^{s_1} \ket{10}+(-1) ^{s_1+s_0} \ket{11}\big). \end{aligned}

This factorizes into:

$\frac{1}{\sqrt{2}}\big(\ket 0 + (-1)^{s_1}\ket 1\big)\otimes \frac{1}{\sqrt{2}}\big(\ket 0 + (-1)^{s_0}\ket 1\big).$

This reduces to the same argument as $n=1$ for each qubit, so after the final $H$ gates the state becomes:

\ket{s_1}\otimes \ket {s_0} \equiv \ket{s_1 s_0}.

Implementing Bernstein-Vazirani

U_{f_s} \ket{x} \ket y = \ket{x} \ket {y\oplus (s\cdot x)}

y\oplus (s\cdot x) = y\oplus s_0 x_0 \oplus s_1 x_1 \oplus \cdots \oplus s_{n-1} x_{n-1}

U_{f_s} \ket{x} \ket y = \ket{x} \ket {y\oplus s x}

For $n = 1$ , $U_{f_s}$ is the identity $\mathbb{I}$ if $s = 0$ and a $CNOT$ if $s = 1$ .
To implement $f_{s}(x) = s \cdot x$ , we can use a CNOT from qubit $i$ to the scratch qubit if $s_i = 1$ .

import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator


def bv_query(n, secret=None):
    # Build oracle for f_s(x) = s · x
    # q[0]..q[n-1] = input register
    # q[n] = scratch/output qubit

    if secret is None:
        value = np.random.randint(0, 2 ** n)
        secret = format(value, f"0{n}b")
    else:
        secret = secret.zfill(n)

    oracle = QuantumCircuit(n + 1, name="Uf")
    for index, bit in enumerate(reversed(secret)):
        if bit == "1":
            oracle.cx(index, n)

    return oracle, secret


def bernstein_vazirani_circuit(n, secret=None):
    # Build full Bernstein-Vazirani circuit
    oracle, secret = bv_query(n, secret)

    qc = QuantumCircuit(n + 1, n)

    # Prepare scratch qubit in |->
    qc.x(n)

    # Apply Hadamards to all qubits
    for i in range(n + 1):
        qc.h(i)

    # Apply oracle
    qc.compose(oracle, inplace=True)

    # Apply final Hadamards to input register
    for i in range(n):
        qc.h(i)

    # Measure input register
    for i in range(n):
        qc.measure(i, i)

    return qc, secret


def run_bernstein_vazirani(n, secret=None, shots=1):
    qc, secret = bernstein_vazirani_circuit(n, secret)

    simulator = AerSimulator()
    compiled = transpile(qc, simulator)
    result = simulator.run(compiled, shots=shots).result()
    counts = result.get_counts()

    measured = max(counts, key=counts.get)
    # Qiskit orders the result string as c[n-1]...c[0], which already matches `secret`
    recovered = measured

    return qc, secret, recovered, counts


# Example
qc, secret, recovered, counts = run_bernstein_vazirani(5, "01101")
print(qc.draw())
print("Secret   :", secret)
print("Measured :", recovered)
print("Counts   :", counts)

OPENQASM 2.0;
qreg q[6];
creg c[5];

x q[5];
h q[0];
h q[1];
h q[2];
h q[3];
h q[4];
h q[5];

cx q[0],q[5];
cx q[2],q[5];
cx q[3],q[5];

h q[0];
h q[1];
h q[2];
h q[3];
h q[4];

measure q[0] -> c[0];
measure q[1] -> c[1];
measure q[2] -> c[2];
measure q[3] -> c[3];
measure q[4] -> c[4];

Summary

There are $2^{2^n}$ possible Boolean functions from $\{0,1\}^n$ to $\{0,1\}$ .
A Boolean function is balanced if it outputs 1 on exactly half of the inputs, that is, on $2^{n-1}$ out of the $2^n$ possible inputs.
In Deutsch’s algorithm with $n=1$ , only 1 quantum query is needed to distinguish a constant function from a balanced function.
A quantum oracle for $f$ is defined as the unitary $U_f:\ |x\rangle|y\rangle \mapsto |x\rangle|y\oplus f(x)\rangle.$
Every XOR-based function of the form $f(x)=x_j\oplus x_k\oplus\cdots$ that depends on at least one input bit is balanced.
Using multi-controlled $X$ gates, together with anti-controls when needed, we can implement an oracle for any Boolean function.
The phase kickback multiplies the state by the phase factor $(-1)^{f(x)}$ , so the phase changes exactly when $f(x)=1$ .
If these questions were stored in a Python list called questions, then the 8th question would be questions[7].
Applying Hadamard gates to $n$ qubits initialized in $|0\rangle$ produces an equal superposition over all $2^n$ computational basis states.
In Deutsch–Jozsa for $n>1$ , measuring all input qubits as 0 means that $f$ is constant.
Bernstein–Vazirani solves the hidden-string problem $f(x)=s\cdot x$ with 1 quantum query.
A multi-controlled $X$ gate with 3 controls can be implemented using scratch qubits and 4 Toffoli gates.
Applying $U_{f_1}$ followed by $U_{f_2}$ yields an oracle whose action on the scratch qubit corresponds to $f_1(x)\oplus f_2(x)$ .
Uncomputation resets scratch qubits to their initial states by applying the inverse of the computation, which means reversing the order of the steps.
Multi-controlled gates with more than one control can be decomposed into simpler gates, but in general this requires more than just $CX$ and $X$ ; single-qubit gates are also needed.

CNN 006

March 28, 2026 · 4 min read

Gracefullight

Owner

Data Preparation

Small Dataset (Range: 100 to 100,000 samples)
- Train/Valid/Test: 60/20/20
- Train/Test: 70/30
Large Dataset (Range: 500,000 to 1M+ samples)
- Train/Valid/Test: 98%/10,000/10,000
- Usually, more training data gives better performance.
Rule of Thumb: Validation and test sets should come from the same distribution.

Bias and Variance

Bias: A value that shifts the activation function to the left or right to better fit the data.
- With a bias, the curve/line does not always pass through the origin
- This can give a better fit to the training data
Variance: The sensitivity of the model to small fluctuations in the training data.
- The change in prediction accuracy of an ML model between training data and test data.
- A model with high variance pays a lot of attention to the training data and does not generalize to data it has not seen before.
- With high variance, the model performs very well on training data but poorly on test data.

Bias and Variance

High Bias
- High training error, underfitting
- Validation/test error nearly same as train error
- Potential things to try:
  - Increase features
  - Make ML model more complicated
  - Decrease regularization parameters
High Variance
- Low training error, overfitting
- High validation/test error
- Potential things to try:
  - Increase dataset size
  - Reduce input features
  - Increase the regularization parameter

Accuracy

Bayesian Optimal Error (BOE): The lowest error that can be achieved by any model on a given dataset.
Human-Level performance:
- Humans are very good at a lot of tasks
- Can get labelled data from humans to help improve the model performance
- Gain insights from manual error analysis

Regularization

A technique that makes slight modifications to the learning algorithm so that the model generalizes better to unseen data.
Update the loss/cost function by adding a regularization term
- $\text{Loss function} = \text{Loss} + \text{Regularization term}(\lambda)$
- Due to $\lambda$ , the values in the weight matrices decrease, on the assumption that a neural network with smaller weights is a simpler model.
- Regularization penalizes the weight matrices of the nodes
L2 regularization
L1 regularization
Dropout

L2 Regularization

$\text{Cost function} = \text{Loss} + \frac{\lambda}{2m} \sum_{j=1}^{n_x} w_j^2$

$\lambda$ is a hyper-parameter
Also known as weight decay, as it forces the weights to decay towards zero, but not exactly to zero.

L1 Regularization

$\text{Cost function} = \text{Loss} + \frac{\lambda}{2m} \sum_{j=1}^{n_x} |w_j|$

Penalize the absolute value of the $w$
Weight may reduce to zero
Useful in compressing a model (sparse model)
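The L2 and L1 cost functions above differ only in the penalty term. A minimal sketch that follows the formulas as written in this post (both scaled by $\frac{\lambda}{2m}$); the function names and the example numbers are made up for illustration.

```python
def l2_cost(loss, weights, lam, m):
    """Loss plus (lambda / 2m) * sum of squared weights."""
    return loss + lam / (2 * m) * sum(w * w for w in weights)


def l1_cost(loss, weights, lam, m):
    """Loss plus (lambda / 2m) * sum of absolute weights."""
    return loss + lam / (2 * m) * sum(abs(w) for w in weights)


# Hypothetical values: unregularized loss 1.0, three weights, m = 4 samples
weights = [0.5, -1.0, 2.0]
print(l2_cost(1.0, weights, lam=2.0, m=4))
print(l1_cost(1.0, weights, lam=2.0, m=4))
```

The squared penalty is dominated by the largest weight, while the absolute-value penalty charges every weight in proportion to its size, which is what pushes small weights all the way to zero.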

Dropout

It produces good results and is the most popular regularization technique in deep learning.
At every iteration, it randomly selects and drops some nodes and removes all the connections to those nodes.
Each iteration has a different set of nodes.

Data Augmentation

A simple way to reduce overfitting is to increase the size of the training dataset.
More samples can be created from the existing set by applying the following simple operations:
- Flip
- Rotate
- Scale
- Crop
- Translate
- Gaussian Noise

Cutout

Simple regularization technique of randomly masking out square regions of input during training.
Patch size: 16x16 to 64x64
Fill value: 0 or mean pixel value
Patches: 1-3 per image

Mixup

Trains a neural network on convex combinations of pairs of examples and their labels.
It regularizes the neural network to favor simple linear behavior in-between training examples.
$\lambda \cdot$ Image A + $(1 - \lambda) \cdot$ Image B = Blended Output, e.g. $\lambda = 0.55$ weights Image A by 0.55 and Image B by 0.45.
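The blending step can be sketched on flattened images and one-hot labels. A minimal illustration, assuming a fixed mixing weight `lam` (in practice it is usually drawn at random per batch) and images represented as plain lists; the function name `mixup` and the sample values are hypothetical.

```python
def mixup(image_a, label_a, image_b, label_b, lam):
    """Convex combination of two examples and of their one-hot labels."""
    image = [lam * a + (1 - lam) * b for a, b in zip(image_a, image_b)]
    label = [lam * a + (1 - lam) * b for a, b in zip(label_a, label_b)]
    return image, label


# Hypothetical 4-pixel images and two classes
image_a, label_a = [1.0, 1.0, 0.0, 0.0], [1.0, 0.0]
image_b, label_b = [0.0, 0.0, 1.0, 1.0], [0.0, 1.0]

mixed_image, mixed_label = mixup(image_a, label_a, image_b, label_b, lam=0.55)
print(mixed_image)
print(mixed_label)
```

The label is mixed with the same weight as the pixels, so the target becomes a soft distribution over both classes.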

CutMix

Patches are cut and pasted among training images, where the ground truth labels are also mixed proportionally to the area of the patches.
Image A + Image B (Patch) = Pasted Patch Output.

Random Augmentation

A set of augmentation operations is defined, and a random subset of these operations is applied to each image during training.
Identity, AutoContrast, Equalize, Rotate, Solarize, Color, Posterize, Contrast, Brightness, Sharpness, ShearX/Y, TranslateX/Y

Generative Adversarial Networks (GANs)

Able to generate images which look similar to the original ones
Proven to be very effective in data augmentation, especially when the dataset is small.

Neural Style Transfer

Using CNN to separate style
Transfer style to different image

Research Guide

March 28, 2026 · One min read

Gracefullight

Owner

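Choosing a Topic

Of the four years of a PhD, about 1.5 years should be spent just reading.
The Area of Interest (AOI) has to be something I enjoy, otherwise I will not last.
After those 1.5 years, a Research Question will emerge.
Collect the Future Work sections from papers published within the last four years.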

Good Research Topics

Government
- Originality of content, document
- How can it be differentiated from generated content?

Master leading to PhD

UTS: feature research students

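Papers

If the Abstract is interesting, read the Conclusion; then, if the content is relevant, read the whole paper.
Do not try to understand the formulas from the start.
Read papers from 2022 onward (no more than four years back).
- Because a PhD takes 4-5 years.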

CNN 005

March 28, 2026 · 3 min read

Gracefullight

Owner

Computer Vision

Classification
Classification with Localization
Object Detection

-	ANN	CNN
Input	1D vector	3D tensor (height, width, channels)
Connections	Fully connected	Local connections (receptive fields)
Overfitting	Prone to overfitting	Less prone to overfitting

Convolutional Neural Networks (CNN)

Convolutional Layer (CONV)
Pooling Layer (POOL)
Fully Connected Layer (FC)

LENET-5

Convolutional Layer (CONV)

The first layer to extract features from an input image
Core building block of a CNN
Convolution is the basic operation in this layer
A number of filters (e.g. edge detectors) are applied to the input image.

Padding

Padding is used to control the spatial size of the output feature maps.
Negative values at the edges can naturally arise because of padding, and they usually are not a big problem because activation functions and later layers come afterward.
Input Matrix dimension: $n \times n \times c$ (height, width, channels)
Filter size: $f \times f$
Padding ( $P$ ): number of pixels added to the border of the input (e.g., 1)
$(n \times n) * (f \times f) \to (n + 2P - f + 1) \times (n + 2P - f + 1)$
- Example: $5 \times 5$ input with $3 \times 3$ filter and padding of 1 results in a $5 \times 5$ output feature map.
If the input and output matrix dimensions are the same, then $P = \frac{f - 1}{2}$ .
Valid padding ( $P = 0$ ): No Padding
Same padding ( $P = \frac{f - 1}{2}$ ): Output size equals input size; this requires appropriate padding.

Stride

It is the number of pixels by which the filter slides across the input image.

No Padding Strides	Stride with Padding

Github: vdumoulin/conv_arithmetic
Input Matrix dimension: $n \times n$
Filter size: $f \times f$
Padding: $P$
Stride: $S$
Output Size = $\left\lfloor \frac{n + 2P - f}{S} + 1 \right\rfloor \times \left\lfloor \frac{n + 2P - f}{S} + 1 \right\rfloor$
- Example: Input Matrix dimension: $5 \times 5$ , Filter size: $3 \times 3$ , Padding: $1$ , Stride: $2$ results in an output size of $3 \times 3$ , since $\left\lfloor \frac{5 + 2 - 3}{2} + 1 \right\rfloor = 3$ .
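The output-size formula translates directly into integer arithmetic. A minimal sketch for square inputs and filters; the helper names `conv_output_size` and `same_padding` are illustrative.

```python
def conv_output_size(n, f, padding=0, stride=1):
    """Side length of the output: floor((n + 2P - f) / S) + 1."""
    return (n + 2 * padding - f) // stride + 1


def same_padding(f):
    """Padding that keeps the size unchanged for stride 1 and odd f."""
    return (f - 1) // 2


# 5x5 input, 3x3 filter, padding 1, stride 1: size is preserved
print(conv_output_size(5, 3, padding=1, stride=1))

# 5x5 input, 3x3 filter, no padding, stride 2
print(conv_output_size(5, 3, padding=0, stride=2))

# "Same" padding for a 7x7 filter on a 32x32 input
print(conv_output_size(32, 7, padding=same_padding(7)))
```

Integer floor division implements the floor in the formula, so inputs that the stride does not divide evenly are handled without special cases.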

Pooling Layer (POOL)

Down sampling operation which reduces the dimensionality of a matrix.
Reduces the number of parameters for large images while retaining the valuable information.
Max pooling
Average pooling
Sum pooling

Fully Connected Layer (FC)

A traditional Multi-Layer Perceptron (MLP) layer
For multi-class classification, usually Softmax activation is used.
Softmax ensures the outputs are probabilities that sum to 1.
The outputs of the CONV and POOL layers represent high-level features of the input image.
The FC layer takes these features to classify the input image into the desired output classes.

CNN 004

March 26, 2026 · 6 min read

Gracefullight

Owner

Logistic Regression as Neural Network

if $y = 1$
- $L = -\log(\hat{y})$
- if $\hat{y} \to 1$ , then $L \to 0$ (low loss)
- if $\hat{y} \to 0$ , then $L \to \infty$ (high loss)
if $y = 0$
- $L = -\log(1 - \hat{y})$
- if $\hat{y} \to 0$ , then $L \to 0$ (low loss)
- if $\hat{y} \to 1$ , then $L \to \infty$ (high loss)
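Both cases can be folded into the single expression $L = -(y \log \hat{y} + (1 - y) \log(1 - \hat{y}))$. The sketch below evaluates it; the clipping constant `eps` is an assumption added here to avoid `log(0)` and is not part of the notes.

```python
import math


def binary_cross_entropy(y: int, y_hat: float, eps: float = 1e-12) -> float:
    """Loss for one sample; y is the label (0 or 1), y_hat the predicted probability."""
    y_hat = min(max(y_hat, eps), 1 - eps)  # assumption: clip to keep log() finite
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))


print(binary_cross_entropy(1, 0.99))  # small loss: confident and correct
print(binary_cross_entropy(1, 0.01))  # large loss: confident and wrong
print(binary_cross_entropy(0, 0.01))  # small loss
```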

Gradient Descent

It is an iterative approach for error correction in a machine learning model
Find $w$ and $b$ that will minimize $GD(w, b)$ (requires Loss/Cost function)

Initialize $w$ and $b$
Perform Forward pass operation/calculations
Compute Loss/Cost function $L(a, y)$
Compute change in $w$ and $b$ (Take the partial derivative of the cost function with respect to Weights and bias $dw$ and $db$ )
Update $w$ and $b$ ( $w := w - \alpha dw$ and $b := b - \alpha db$ )
Repeat from Step 2 with new values of $w$ and $b$ for 'n' number of iterations.

$\alpha$ is the learning rate (hyperparameter) that controls how much we are adjusting the weights and bias of our model with respect to the loss gradient. It is a small positive value (e.g., 0.01, 0.001) that determines the step size at each iteration while moving toward a minimum of the loss function.
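The six steps can be sketched for logistic regression with a single feature. The toy data, the learning rate of 0.3, and the iteration count are illustrative assumptions, not values from the notes.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def cost(w, b, xs, ys):
    """Step 3: average cross-entropy loss over all samples."""
    total = 0.0
    for x, y in zip(xs, ys):
        a = sigmoid(w * x + b)
        total += -(y * math.log(a) + (1 - y) * math.log(1 - a))
    return total / len(xs)


def gradient_descent(xs, ys, alpha=0.3, iterations=2000):
    w, b = 0.0, 0.0                                      # step 1: initialize
    for _ in range(iterations):                          # step 6: repeat
        acts = [sigmoid(w * x + b) for x in xs]          # step 2: forward pass
        # step 4: partial derivatives of the average loss
        dw = sum((a - y) * x for a, x, y in zip(acts, xs, ys)) / len(xs)
        db = sum(a - y for a, y in zip(acts, ys)) / len(xs)
        w -= alpha * dw                                  # step 5: update
        b -= alpha * db
    return w, b


xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]  # hypothetical toy inputs
ys = [0, 0, 0, 1, 1, 1]              # hypothetical labels
w, b = gradient_descent(xs, ys)
print("cost before:", cost(0.0, 0.0, xs, ys))
print("cost after: ", cost(w, b, xs, ys))
```

The loss value itself is not needed for the update; it is computed here only to watch the cost fall.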

Gradient Descent Types

Batch Gradient Descent (BGD)
Stochastic Gradient Descent (SGD)
Mini-batch Gradient Descent (MBGD)

Batch Gradient Descent (BGD)

Process each input sample and find the cost
Find the average cost over all input samples
Update $w$ and $b$ and repeat the steps for "n" epochs(iterations)

Disadvantages:
- It uses the complete dataset to calculate the gradients at every step
- Slow when training data is large
- Difficult to find the learning rate
- Difficult to ascertain the number of epochs(iterations)

Stochastic Gradient Descent (SGD)

Due to the random nature, the algorithm is much less regular than BGD.

Process a random input sample and find the cost.
Update $w$ and $b$ , and repeat the steps for "n" iterations on the training samples.

Advantages:
- Computes gradient based on single input sample, which is memory efficient.
- Much faster compared to BGD.
- Possible to train on large datasets.
- Randomness is helpful to escape local minima.
Disadvantages:
- Might not reach the optimal value, but very close to it.
  - Simulated annealing: Reduce the learning rate gradually
  - Create a Learning Schedule to determine the learning rate at each iteration.

Mini-batch Gradient Descent (MBGD)

Divide the training set into mini-batches of size $n$ (e.g., 64, 128, 256).
Process all the samples in a mini-batch and find the average cost
Update $w$ and $b$ , and repeat the steps for "n" iterations/epochs on the training samples.

Advantages:
- Computes gradient based on small sets of input samples.
- Much faster compared to BGD.
- Possible to train on large datasets.
- Performance boost on matrix operations using GPUs.
- Might not reach the optimal value, but gets very close to it and possibly closer than SGD.
Disadvantages:
- It may be harder to escape the local minima compared to SGD.
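The three variants differ only in how many samples feed each update. A minimal sketch of the batching step, assuming the data is reshuffled once per epoch (the helper name `minibatches` and the seed argument are illustrative):

```python
import random


def minibatches(samples, batch_size, seed=0):
    """Yield shuffled mini-batches; the last batch may be smaller."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]


data = list(range(10))
for batch in minibatches(data, batch_size=4):
    # one parameter update would be computed from the average cost of this batch
    print(batch)
```

With `batch_size=len(data)` this reduces to batch gradient descent, and with `batch_size=1` to stochastic gradient descent.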

Exponentially Weighted Averages

One of the popular algorithms for smoothing sequential data (time series data), also known as a moving average.
Weights past observations with exponentially decaying factors and uses their weighted average

V_0 = 0 \\ V_1 = 0.9 \cdot V_0 + 0.1 \cdot \theta_1 \\ V_2 = 0.9 \cdot V_1 + 0.1 \cdot \theta_2 \\ V_3 = 0.9 \cdot V_2 + 0.1 \cdot \theta_3 \\ \vdots \\ V_t = 0.9 \cdot V_{t-1} + 0.1 \cdot \theta_t \\ V_t = \beta \cdot V_{t-1} + (1 - \beta) \cdot \theta_t

$V_t$ is approximate average over $\approx \frac{1}{1 - \beta}$ time steps.

For $\beta = 0.9$ , $V_t$ is average over the last 10 time steps.
For $\beta = 0.98$ , $V_t$ is average over the last 50 time steps.
For $\beta = 0.5$ , $V_t$ is average over the last 2 time steps.
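The recurrence $V_t = \beta V_{t-1} + (1 - \beta) \theta_t$ translates directly into a loop. A minimal sketch with a hypothetical function name:

```python
def exponentially_weighted_average(values, beta=0.9):
    """Return the running averages V_1..V_t, starting from V_0 = 0."""
    v = 0.0
    averages = []
    for theta in values:
        v = beta * v + (1 - beta) * theta
        averages.append(v)
    return averages


print(exponentially_weighted_average([1.0, 1.0, 1.0], beta=0.9))
# approximately [0.1, 0.19, 0.271]
```

Because $V_0 = 0$, the first few averages underestimate the data, which is visible in the output above.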

Optimizers

SGD with Momentum

At iteration $t$ :

Calculate $dw$ and $db$ on the current mini-batch (Hyperparameters: $\alpha$ and $\beta$ )
Update the velocity:
- $V_{dw} = \beta V_{dw} + (1 - \beta) dw \rightarrow V_t = \beta V_{t-1} + (1 - \beta) \theta_t$
- $V_{db} = \beta V_{db} + (1 - \beta) db$
Update parameters:
- $w := w - \alpha V_{dw}$
- $b := b - \alpha V_{db}$
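One momentum update for a single parameter can be sketched as below. The quadratic objective $f(w) = w^2$ and the step count are assumptions chosen only to show the update converging.

```python
def momentum_step(w, dw, v_dw, alpha=0.1, beta=0.9):
    """Update the velocity as an exponentially weighted average, then the parameter."""
    v_dw = beta * v_dw + (1 - beta) * dw
    w = w - alpha * v_dw
    return w, v_dw


w, v = 5.0, 0.0
for _ in range(200):
    dw = 2 * w          # gradient of the toy objective f(w) = w^2
    w, v = momentum_step(w, dw, v)
print(w)  # close to the minimum at 0
```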

RMSProp

Root Mean Square Propagation.
Unpublished adaptive learning method by Geoffrey Hinton.
Reduces oscillation but in a different way than Momentum.
Divides the learning rate by an exponentially decaying average of squared gradients.
Calculate $dw$ and $db$ on the current mini-batch
- $S_{dw} = \beta S_{dw} + (1 - \beta) dw^2$
- $S_{db} = \beta S_{db} + (1 - \beta) db^2$
Update parameters:
- $w := w - \alpha \frac{dw}{\sqrt{S_{dw}} + \epsilon}$
- $b := b - \alpha \frac{db}{\sqrt{S_{db}} + \epsilon}$
- $\epsilon$ is a small number to prevent division by zero (e.g., $10^{-8} \text{ to } 10^{-10}$ )

Adam

Adaptive Moment Estimation
Combination of RMSProp and Momentum
Works well for a wide range of non-convex optimization problems in machine learning.
Calculate $dw$ and $db$ on the current mini-batch
- $V_{dw} = \beta_1 V_{dw} + (1 - \beta_1) dw \leftarrow Momentum, \beta_1$
- $V_{db} = \beta_1 V_{db} + (1 - \beta_1) db$
- $S_{dw} = \beta_2 S_{dw} + (1 - \beta_2) dw^2 \leftarrow RMSProp, \beta_2$
- $S_{db} = \beta_2 S_{db} + (1 - \beta_2) db^2$
Update parameters:
- $w := w - \alpha \frac{V_{dw}}{\sqrt{S_{dw}} + \epsilon}$
- $b := b - \alpha \frac{V_{db}}{\sqrt{S_{db}} + \epsilon}$
- $\epsilon$ is a small number to prevent division by zero (e.g., $10^{-8} \text{ to } 10^{-10}$ )
Hyper parameter guide:
- $\alpha = 0.001$
- $\beta_1 = 0.9$ : Momentum term
- $\beta_2 = 0.999$ : Moving weighted average
- $\epsilon = 10^{-8}$ : To prevent division by zero
ensmallen.org
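The Adam update as written in these notes combines the momentum term $V$ with the RMSProp term $S$. The sketch below follows those equations literally for one parameter; note that the original Adam paper also applies bias correction to $V$ and $S$, which is omitted here to match the notes.

```python
import math


def adam_step(w, dw, v_dw, s_dw, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update without bias correction, as in the equations above."""
    v_dw = beta1 * v_dw + (1 - beta1) * dw         # momentum term
    s_dw = beta2 * s_dw + (1 - beta2) * dw ** 2    # RMSProp term
    w = w - alpha * v_dw / (math.sqrt(s_dw) + eps)
    return w, v_dw, s_dw


w, v, s = adam_step(w=1.0, dw=2.0, v_dw=0.0, s_dw=0.0)
print(w, v, s)
```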

Learning Rate Decay

Speed up the learning algorithm by slowly decreasing the learning rate $\alpha$ as the number of epochs increases.
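The notes do not fix a particular schedule. One common choice, assumed here purely for illustration, is inverse-time decay $\alpha = \frac{\alpha_0}{1 + \text{decay\_rate} \cdot \text{epoch}}$:

```python
def decayed_learning_rate(alpha0: float, decay_rate: float, epoch: int) -> float:
    """Inverse-time decay (an assumed schedule, one of several in common use)."""
    return alpha0 / (1 + decay_rate * epoch)


for epoch in range(4):
    print(epoch, decayed_learning_rate(alpha0=0.1, decay_rate=1.0, epoch=epoch))
```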

Activation Functions

Getting the output of a layer in a neural network and applying a non-linear function to it.
- Sigmoid: $\sigma(x) = \frac{1}{1 + e^{-x}}$
- Tanh: $\tanh(x) = \frac{2}{1 + e^{-2x}} - 1$
- Sigmoid is used for binary classification in the output layer.
ReLU: $A(x) = \max(0, x)$
- Rectified Linear Unit
- Helps mitigate the vanishing gradient problem
- Best used in hidden layers
- Computationally less expensive than sigmoid and tanh
Softmax: $S(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$
- Turns numbers into probabilities that sum to 1.
- Used for multi-class classification in the output layer.
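The four functions can be written out with the standard library alone. Subtracting the maximum inside `softmax` is a numerical-stability step added here as an assumption; it does not change the result.

```python
import math


def sigmoid(x: float) -> float:
    return 1 / (1 + math.exp(-x))


def tanh(x: float) -> float:
    return 2 / (1 + math.exp(-2 * x)) - 1


def relu(x: float) -> float:
    return max(0.0, x)


def softmax(xs):
    shift = max(xs)  # assumption: shift for numerical stability
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0))              # 0.5
print(relu(-3.0), relu(2.5))     # 0.0 2.5
print(softmax([1.0, 2.0, 3.0]))  # three probabilities that sum to 1
```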

IQC 005

March 21, 2026 · 10 min read

Gracefullight

Owner

Encoding

Basis encoding: binary strings -> computational basis states
Amplitude encoding: data values -> amplitudes of a quantum state
Angle encoding: data values -> rotation angles on individual qubits

Basis Encoding

$x = b_1 b_2 ... b_n \rightarrow \ket{x} = \ket{b_1 b_2 ... b_n}$

applying $X$ gates to flip qubits from $\ket{0}$ to $\ket{1}$ based on the binary representation of the data.
$\ket{00} \rightarrow \ket{01}$ (X gate on the second qubit)
$\ket{00} \rightarrow \ket{10}$ (X gate on the first qubit)
Dataset superposition: to encode the set $\{01, 11\}$
- $\ket{S} = \frac{1}{\sqrt{2}}(\ket{01} + \ket{11})$
- it requires Hadamard and/or controlled gates
Hadamard transform
- $H^{\otimes n}\ket{0}^{\otimes n} = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} \ket{x}$

import pennylane as qml

x = [int(b) for b in '1011']

def circuit(x):
  # Flip each qubit whose corresponding bit is 1
  qml.BasisEmbedding(features=x, wires=range(len(x)))
  return qml.state()

dev = qml.device('default.qubit', wires=4)
qnode = qml.QNode(circuit, dev)

print(qml.draw(qml.transforms.decompose(qnode), show_all_wires=True)(x))

Amplitude Encoding

$x = [x_0, x_1, ..., x_{N-1}] \rightarrow |\psi\rangle = \sum_{k=0}^{N-1} x_k |k\rangle$ , where $N = 2^n$ for $n$ qubits

The most qubit-efficient encoding method, since $n$ qubits can encode $2^n$ amplitudes.

\begin{align*} \ket{\bf 0} &\equiv \ket{000} \\ \ket{\bf 1} &\equiv \ket{001} \\ \ket{\bf 2} &\equiv \ket{010} \\ \ket{\bf 3} &\equiv \ket{011} \\ \ket{\bf 4} &\equiv \ket{100} \\ \ket{\bf 5} &\equiv \ket{101} \\ \ket{\bf 6} &\equiv \ket{110} \\ \ket{\bf 7} &\equiv \ket{111} \end{align*}

The state can be rewritten more explicitly (for 3 qubits)

$\ket{\psi} = x_0 \ket{000} + x_1 \ket{001} + x_2 \ket{010} + x_3 \ket{011} + x_4 \ket{100} + x_5 \ket{101} + x_6 \ket{110} + x_7 \ket{111}$

For $N$ data points, we need $n = \log_2(N)$ qubits.
Converting to binary
- with $n$ qubits, the computational basis states range from
  - $\ket{0}$ to $\ket{2^n - 1}$
- $[x_0, x_1, ..., x_{N-1}]$ can be mapped to $x_0\ket{000} + x_1\ket{001} + \cdots + x_{N-1}\ket{(N-1)}$ .
- $b_i = \left\lfloor \frac{k}{2^i} \right\rfloor \bmod 2$ .
if $k = 5$
- $b_0 = \left\lfloor \frac{5}{2^0} \right\rfloor \bmod 2 = 1$
- $b_1 = \left\lfloor \frac{5}{2^1} \right\rfloor \bmod 2 = 0$
- $b_2 = \left\lfloor \frac{5}{2^2} \right\rfloor \bmod 2 = 1$
- gives $b_2 b_1 b_0 = 101$ (binary representation of 5)
- $\ket{\bf 5} = \ket{101}$
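The bit formula $b_i = \lfloor k / 2^i \rfloor \bmod 2$ can be checked directly. A minimal sketch; `basis_label` is a hypothetical helper name.

```python
def basis_label(k: int, n_qubits: int) -> str:
    """Return the n-qubit binary label of basis state |k>, most significant bit first."""
    bits = [(k // 2 ** i) % 2 for i in range(n_qubits)]  # b_0, b_1, ..., b_{n-1}
    return "".join(str(b) for b in reversed(bits))


print(basis_label(5, 3))  # 101
print([basis_label(k, 3) for k in range(8)])
```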

Angle Encoding

Each classical value controls the rotation angle of a qubit gate.
$x = [x_0, x_1, ..., x_{n-1}]$
$RY(x_0) \ket{0} \otimes RY(x_1) \ket{0} \otimes ... \otimes RY(x_{n-1}) \ket{0}$
where $RY(\theta) = \begin{bmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{bmatrix}$ is the $Y$ rotation gate.
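Since $RY(\theta)\ket{0} = \cos(\theta/2)\ket{0} + \sin(\theta/2)\ket{1}$, the encoded product state can be built classically for small inputs. This is a pure-Python sketch for illustration (function names are made up), not how a quantum library prepares the state.

```python
import math


def ry_on_zero(theta: float):
    """Amplitudes of RY(theta)|0> as [amp of |0>, amp of |1>]."""
    return [math.cos(theta / 2), math.sin(theta / 2)]


def angle_encode(xs):
    """Tensor product of RY(x_i)|0> over all qubits; qubit 0 is the leftmost bit."""
    state = [1.0]
    for x in xs:
        qubit = ry_on_zero(x)
        state = [a * b for a in state for b in qubit]
    return state


state = angle_encode([math.pi, 0.0])
print([round(a, 6) for a in state])  # amplitude 1 on |10>
```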

Summary of Encoding Methods

Method	Qubit cost	Example Classical Data	Use Case	Circuit Complexity
Basis Encoding	1 qubit per bit	Binary strings ('1011')	Binary data, configs	Easy
Amplitude Encoding	$\log_2(N)$ qubits	Vector of real numbers	QML, optimization	Hard
Angle Encoding	1 qubit per data point	Vector of angles	Variational / hybrid algorithms	Easy

Pennylane: BasisEmbedding, AmplitudeEmbedding, AngleEmbedding
Qiskit: initialize
PyTKET: StatePreparationBox

Amplitude Encoding Circuit

The algorithm to prepare an arbitrary vector of length $2^n$ takes roughly that many gates (e.g. for 3 qubits, $2^3 = 8$ , so on the order of 10 gates).

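import numpy as np
import pennylane as qml

wires = [0, 1, 2]

def circuit(x):
  # Load the normalized vector x into the amplitudes of the 3-qubit state
  qml.AmplitudeEmbedding(x, wires)
  return qml.state()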

dev = qml.device("default.qubit", wires=wires)
qnode = qml.QNode(circuit, dev)

# Random vector of length 8 (for 3 qubits)
x = np.random.rand(8)
# Normalize the vector to have unit length (required for quantum states)
x = x/np.sqrt(np.sum(x**2))

qml.draw_mpl(qnode)(x);
qml.draw_mpl(qml.transforms.decompose(qnode), decimals=3)(x);

Amplitude encoding circuit

import numpy as np
from pytket.circuit import Circuit, StatePreparationBox
from pytket.circuit.display import render_circuit_jupyter as draw
from pytket.transform import Transform

state_circ = Circuit(3)

# Example 3-qubit state to prepare (the W state)
w_state = 1 / np.sqrt(3) * np.array([0, 1, 1, 0, 1, 0, 0, 0])

w_state_box = StatePreparationBox(w_state)
state_circ.add_gate(w_state_box, [0, 1, 2])

draw(state_circ)

# Decompose the box in place into elementary gates, then draw again
Transform.DecomposeBoxes().apply(state_circ)
draw(state_circ)

Amplitude encoding circuit with PyTKET

Different algorithms for Rotation gate

Pennylane: Transformation of quantum states using uniformly controlled rotations $R_y(\theta) = e^{-i \theta Y / 2} = \begin{bmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{bmatrix}$
PyTKET: Synthesis of Quantum Logic Circuits $R_y(\theta) = e^{-i \theta \pi Y / 2} = \begin{bmatrix} \cos(\theta \pi / 2) & -\sin(\theta \pi / 2) \\ \sin(\theta \pi / 2) & \cos(\theta \pi / 2) \end{bmatrix}$ (angles are given in units of $\pi$ )
Qiskit: Quantum Circuits for Isometries
There are concrete algorithms for encoding classical data into quantum states, but this process is generally computationally expensive.
It usually requires classical preprocessing, and the resulting circuit can be quite deep.
In general, amplitude encoding an arbitrary large vector is expensive.
"a quantum computer can store an exponential amount of data" should be always considered together with the cost of state preparation.
The real advantage of quantum computers does not appear for every problem, but only for specific problems that are well-suited to them.
- Problems where state preparation is natural or efficient.
- Problems that involve simulating quantum systems themselves.
- Structured linear algebra, optimization, or sampling problems.
- Problems where the input is already given as a quantum state.

Binary Logic Gates

a	b	sum	carry
0	0	0	0
0	1	1	0
1	0	1	0
1	1	0	1

Sum $s = a \oplus b$ (XOR)
Carry $c = a \cdot b$ (AND)

Declare registers
Apply operations
Read the result

Full-Adder in Binary Logic

Handles three inputs ( $a$ , $b$ , and carry-in $c_{in}$ ) and produces two outputs (sum $s$ and carry-out $c_{out}$ ).

$a$	$b$	$c_{\text{in}}$	$s$	$c_{\text{out}}$
0	0	0	0	0
0	0	1	1	0
0	1	0	1	0
0	1	1	0	1
1	0	0	1	0
1	0	1	0	1
1	1	0	0	1
1	1	1	1	1

$s = a \oplus b \oplus c_{in}$
$c_{out} = (a \cdot b) \oplus (a \cdot c_{in}) \oplus (b \cdot c_{in})$
A full-adder = two half-adders + an OR gate
Chaining full-adders produces a ripple-carry adder for multi-bit numbers.
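The construction "two half-adders plus an OR gate", chained into a ripple-carry adder, can be sketched classically as follows. Bit lists are assumed to be least-significant-bit first; that ordering and the function names are choices made for this example.

```python
def half_adder(a: int, b: int):
    """Return (sum, carry) = (a XOR b, a AND b)."""
    return a ^ b, a & b


def full_adder(a: int, b: int, c_in: int):
    """Two half-adders plus an OR gate on the two carries."""
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, c_in)
    return s, c1 | c2


def ripple_carry_add(x_bits, y_bits):
    """Add two equal-length bit lists (LSB first); the carry ripples upward."""
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]


# 6 (110) + 5 (101) = 11 (1011), written LSB first
print(ripple_carry_add([0, 1, 1], [1, 0, 1]))  # [1, 1, 0, 1]
```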

Quantum Arithmetic

All quantum gates must be unitary (invertible)
A classical half-adder discards input information after computing the sum and carry.
- This irreversibility is forbidden in quantum circuits.
The computation must be done in a way that preserves all input information.
- XOR and AND operations must be implemented using reversible gates (e.g. Toffoli gate).
- XOR $\rightarrow$ CNOT gate
- AND $\rightarrow$ Toffoli gate

Classical op	Quantum gate
XOR $(a \oplus b)$	$\text{CNOT}(a, b) = \ket{a} \otimes \ket{b \oplus a}$
AND $(a \cdot b)$	$\text{CCX}\ket{a}\ket{b}\ket{c} = \ket{a} \ket{b} \ket{(a \cdot b) \oplus c}$

With a third qubit initialized to $\ket{0}$ , we can compute the AND of $a$ and $b$ without losing information about $a$ and $b$ .

Half-Adder

Half Adder with two work qubits

Half Adder with one work qubit

Full-Adder

Full Adder

$c_{out} = (a \cdot b) \oplus (a \cdot c_{in}) \oplus (b \cdot c_{in})$
$s = a \oplus b \oplus c_{in}$

Reduced Full Adder

$CCX(a, b, c_{out}) \rightarrow CNOT(a, b) \rightarrow CCX(b, c_{in}, c_{out}) \rightarrow CNOT(b, c_{in})$
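On computational basis states, CNOT and Toffoli act as classical reversible bit operations, so the four-gate sequence can be checked with plain integers. A minimal sketch, assuming the $c_{out}$ wire starts at 0:

```python
def reduced_full_adder(a: int, b: int, c_in: int, c_out: int = 0):
    """Apply CCX(a,b,c_out), CNOT(a,b), CCX(b,c_in,c_out), CNOT(b,c_in) to basis-state bits."""
    c_out ^= a & b      # CCX(a, b, c_out)
    b ^= a              # CNOT(a, b): b now holds a XOR b
    c_out ^= b & c_in   # CCX(b, c_in, c_out)
    c_in ^= b           # CNOT(b, c_in): the c_in wire now holds the sum
    return a, b, c_in, c_out


for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            _, _, s, c_out = reduced_full_adder(a, b, c)
            print(a, b, c, "-> sum", s, "carry", c_out)
```

The sum ends up on the $c_{in}$ wire and the $b$ wire is left holding $a \oplus b$, so no input information is lost.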

Ripple Carry Adder

Two full-adder circuits in sequence, overlapping on the carry qubit ("rippling" the carry through the circuit).
CDKM adder
- Carry-out is the majority vote of the three inputs

MAJ circuit

MAJ: Majority, computes carry in-place using only 2 CNOTs + 1 Toffoli
- $(c, b, a) \rightarrow (c \oplus a, b \oplus a, a \cdot b \oplus a \cdot c \oplus b \cdot c)$
UMA: UnMajority-and-Add, reverses MAJ but overwrites $b$ with the sum.
- $(c \oplus a, b \oplus a, c_{out}) \rightarrow (c, a \oplus b \oplus c, a)$ , so $c$ and $a$ are restored and the $b$ wire holds the sum $s = a \oplus b \oplus c$
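The MAJ and UMA mappings can be verified on classical bits. The gate orderings below (two CNOTs then a Toffoli for MAJ, and a Toffoli then two CNOTs for UMA) are an assumption based on the common two-CNOT form of the CDKM construction; the circuit figures are not available in the text.

```python
def maj(c: int, b: int, a: int):
    """MAJ: CNOT(a,b), CNOT(a,c), CCX(c,b,a). The a wire ends up holding the carry."""
    b ^= a
    c ^= a
    a ^= b & c
    return c, b, a


def uma(c: int, b: int, a: int):
    """UMA: CCX(c,b,a), CNOT(a,c), CNOT(c,b). Restores c and a, writes the sum to b."""
    a ^= b & c
    c ^= a
    b ^= c
    return c, b, a


for c in (0, 1):
    for b in (0, 1):
        for a in (0, 1):
            print((c, b, a), "MAJ ->", maj(c, b, a), "UMA ->", uma(*maj(c, b, a)))
```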

UMA circuit

CDKM Ripple-Carry Adder: For two n-bit numbers, chain MAJ/UMA pairs, sharing the carry qubit.
- CDKMRippleCarryAdder

CDKM circuit

Quantum Multiplication

-	-	-	-	-	-
			1	1	0
x			1	0	1
			1	1	0
		0	0	0
+	1	1	0
	1	1	1	1	0

Binary multiplication is "shift and add"
- for each 1-bit in the multiplier, add a shifted version of the multiplicand to the result.
- for each 0-bit, add nothing (or add a zero vector).
In quantum, each partial addition is a conditional adder: every internal gate is controlled by a bit of the multiplier.

$CU = \ket{0}\bra{0} \otimes I + \ket{1}\bra{1} \otimes U$
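Classically, shift-and-add looks like the loop below; the `if` on each multiplier bit plays the role that the control qubit plays in a controlled adder. A minimal sketch with a hypothetical function name:

```python
def shift_and_add(multiplicand: int, multiplier: int) -> int:
    """Multiply by adding a shifted multiplicand for every 1-bit of the multiplier."""
    result = 0
    shift = 0
    while multiplier:
        if multiplier & 1:                    # "control" bit is 1: add
            result += multiplicand << shift
        multiplier >>= 1                      # move to the next multiplier bit
        shift += 1
    return result


print(bin(shift_and_add(0b110, 0b101)))  # 0b11110 (6 x 5 = 30)
```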

Superposition

Same adder circuit works on superposition of inputs

$\text{ADDER} \ket{2} (\ket{1} + \ket{3}) \ket{0} = \ket{2} (\ket{1} \ket{3} + \ket{3} \ket{5})$

With one adder, both $2 + 1$ and $2 + 3$ are computed simultaneously.
However, measurement collapses the superposition, so we can only ever see one result per run: either $2 + 1 = 3$ or $2 + 3 = 5$ .
Amplifying the probability of amplitudes corresponding to correct answers is a key part of quantum algorithms.

Summary

Basis encoding maps each classical bit to a qubit, with 0 mapped to $|0\rangle$ and 1 mapped to $|1\rangle$ .
To encode 1011, we need X gates on qubits 0, 2, and 3 (assuming qubit 0 is the most significant bit).
Applying an H gate (Hadamard gate) to each qubit puts the system in a uniform superposition over all basis states.
Amplitude encoding of an arbitrary vector of size $2^n$ generally requires $O(2^n)$ gates.
In angle encoding, each classical value $x_i$ becomes the rotation angle of an RY gate on the $i$ -th qubit.
Using the binary conversion formula, the decimal number 5 maps to $|101\rangle$ for 3 qubits.
In short, in quantum addition, the sum is an XOR operation, and the carry is an AND operation.
A quantum half-adder must keep the original inputs to remain reversible and maintain its unitary nature.
The Toffoli gate is defined as: $|a\rangle|b\rangle|c\rangle \mapsto |a\rangle|b\rangle|c \oplus a \cdot b\rangle$ .
A full-adder should produce a carry-out as follows: $c_{\text{out}} = a \cdot b \oplus a \cdot c_{\text{in}} \oplus b \cdot c_{\text{in}}$ .
The following operation is reversible: $|a\rangle|b\rangle|c\rangle \mapsto |a\rangle|a \oplus b\rangle|c \oplus a \cdot b\rangle$ .
For addition "in superposition," measuring the output register yields exactly one of the partial sums, chosen probabilistically.
The MAJ circuit computes the majority function as: $a \cdot b \oplus a \cdot c \oplus b \cdot c$ .
The QASM snippet for a half-adder uses ccx q[0], q[1], q[2] to compute the carry ( $a \cdot b$ ) and a cx q[0], q[1] (or cx q[1], q[0]) gate to compute the sum ( $a \oplus b$ ).
Quantum multiplication can be implemented via conditional addition, one per multiplier bit.

IQC 004

March 10, 2026 · 10 min read

Gracefullight

Owner

Superdense Coding

A counterintuitive protocol that allows us to send 2 bits of classical information by sending only 1 qubit, using pre-shared entanglement
It's often contextualized as a game with Alice and Bob.
Alice and Bob share an entangled pair of qubits
Pre-shared Entanglement (Shared Bell pair)
- $|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$
Alice wants to send 2 bits of classical information (00, 01, 10, or 11) to Bob
- For 00: Apply Identity $I$ (do nothing)
- For 01: Apply Pauli-X $X$ (bit flip)
- For 10: Apply Pauli-Z $Z$ (phase flip)
- For 11: Apply $ZX$ (bit flip, then phase flip)
Alice sends her one qubit to Bob (Bob possesses both qubits of the entangled pair)
Bob decodes the information
- Inverts the entanglement operation on the two qubits (CNOT followed by Hadamard)
- Measures each of the qubits
- The outcome of this measurement reveals the two bits Alice encoded.
$|\Phi^+\rangle$ is the most common and convenient Bell state; it is maximally entangled, and applying single-qubit unitary gates keeps it maximally entangled.
- Easy to build circuits for it
- Symmetric and has nice properties that make it ideal for protocols like superdense coding and teleportation
- Other Bell states can be generated from $|\Phi^+\rangle$ by applying single-qubit gates.

Superdense Coding Circuit

Alice's message $mn$ , where each of $m$ and $n$ is a bit (0 or 1)
Alice's operation can be represented as $Z^m X^n \otimes \mathbb{I} | \Phi^+\rangle$
- $m=0 \Rightarrow Z^0 = I$ (no phase flip)
- $m=1 \Rightarrow Z^1 = Z$ (phase flip)
- $n=0 \Rightarrow X^0 = I$ (no bit flip)
- $n=1 \Rightarrow X^1 = X$ (bit flip)
- $00 \rightarrow I$
- $01 \rightarrow X$
- $10 \rightarrow Z$
- $11 \rightarrow ZX$
$Z^m|b\rangle = (-1)^{mb}|b\rangle$ (phase flip)
$X^n|a\rangle = |a \oplus n\rangle$ (xor operation)

$X^n \left(\frac{1}{\sqrt{2}} (|00\rangle + |11\rangle)\right) = \frac{1}{\sqrt{2}} \left( |n0\rangle + |(1\oplus n)1\rangle \right)$

$n=0$ : $\frac{1}{\sqrt{2}} (|00\rangle + |11\rangle) = |\Phi^+\rangle$
$n=1$ : $\frac{1}{\sqrt{2}} (|10\rangle + |01\rangle)$

$Z^m X^n \left(\frac{1}{\sqrt{2}} (|00\rangle + |11\rangle)\right) = \frac{1}{\sqrt{2}} \left( (-1)^{mn}|n0\rangle + (-1)^{m(1\oplus n)}|(1\oplus n)1\rangle \right)$

$|n0\rangle$ : first qubit is $n$ , $(-1)^{mn}$ is the phase factor
$|(1\oplus n)1\rangle$ : first qubit is $1\oplus n$ , $(-1)^{m(1\oplus n)}$ is the phase factor
Bob's decoding operation is the inverse of the entanglement operation
- $CNOT |a\rangle |b\rangle = |a\rangle |a \oplus b\rangle$

\begin{aligned} CNOT\, Z^m X^n \left(\frac{1}{\sqrt{2}} (|00\rangle + |11\rangle)\right) &= \frac{(-1)^{mn}}{\sqrt{2}} \left(|nn\rangle + (-1)^m|(1 \oplus n)n\rangle\right) \\ &= \frac{(-1)^{mn}}{\sqrt{2}} \left(|n\rangle + (-1)^m|1 \oplus n\rangle\right)|n\rangle \end{aligned}

which is a separable state.
First qubit is $\frac{1}{\sqrt{2}}\left(|n\rangle + (-1)^m|1 \oplus n\rangle\right)$ and second qubit is $|n\rangle$ .
Since $n, m \in \{0, 1\}$ , applying Hadamard to the first qubit gives:

H\, \frac{1}{\sqrt{2}} \left(|n\rangle + (-1)^m|1 \oplus n\rangle\right) = (-1)^{mn}|m\rangle

second qubit is $|n\rangle$ , so the final state is:

$(-1)^{mn}(-1)^{nm}|m\rangle|n\rangle = |mn\rangle$
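The derivation can be checked numerically with a four-amplitude state vector. In this pure-Python sketch the amplitudes are ordered $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ with Alice's qubit first; the helper names are made up for the example.

```python
import math

R = 1 / math.sqrt(2)


def x0(s):      # X on the first qubit: swap |0b> and |1b>
    return [s[2], s[3], s[0], s[1]]


def z0(s):      # Z on the first qubit: negate amplitudes with first qubit = 1
    return [s[0], s[1], -s[2], -s[3]]


def cnot01(s):  # CNOT with the first qubit as control, second as target
    return [s[0], s[1], s[3], s[2]]


def h0(s):      # Hadamard on the first qubit
    return [R * (s[0] + s[2]), R * (s[1] + s[3]),
            R * (s[0] - s[2]), R * (s[1] - s[3])]


def superdense(m: int, n: int):
    """Encode bits (m, n) with Z^m X^n on |Phi+>, decode, and return the measured bits."""
    state = [R, 0.0, 0.0, R]          # |Phi+>
    if n:
        state = x0(state)
    if m:
        state = z0(state)
    state = h0(cnot01(state))         # Bob: CNOT, then Hadamard
    probs = [abs(a) ** 2 for a in state]
    k = probs.index(max(probs))       # the outcome is deterministic here
    return k >> 1, k & 1


for m in (0, 1):
    for n in (0, 1):
        print((m, n), "->", superdense(m, n))
```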

Superdense QASM

OPENQASM 2.0;
include "qelib1.inc";
qreg q[2];
creg c[2];

// Prepare Bell pair
h q[0];
cx q[0],q[1];

// Alice's encoding
// For message 11: apply X and Z
x q[0];
z q[0];

// Alice sends q[0] to Bob (in simulation, we just proceed)

// Bob's decoding
cx q[0],q[1];
h q[0];

// Measure both qubits
measure q[0] -> c[0];
measure q[1] -> c[1];

QASM Programming

Custom Gates

// params: parameters for the gate (e.g., rotation angles)
// q_args: quantum arguments (qubits the gate acts on)
gate NAME(parameters) q_args {
  // Define gate operations
}

gate bell a,b {
  h a;
  cx a,b;
}

qreg q[2];
creg c[2];

// Use the custom bell gate
bell q[0], q[1];

// Measure the qubits
measure q[0] -> c[0];
measure q[1] -> c[1];

Toffoli Gate

// Toffoli gate (CCX)
gate ccx a, b, c {
  h c;
  cx b, c;
  tdg c;
  cx a, c;
  t c;
  cx b, c;
  tdg c;
  cx a, c;
  t b;
  t c;
  cx a, b;
  h c;
  t a;
  tdg b;
  cx a, b;
}

Rotation Gates

// Rotation around Y-axis
gate ry_deg(theta) q {
  ry(theta/180 * pi) q;
}

// Rotation around X-axis
gate rx_deg(theta) q {
  rx(theta/180 * pi) q;
}

Qiskit

IBM

import qiskit

superdense = qiskit.QuantumCircuit(2, 2)
superdense.draw()

superdense.h(0)
superdense.cx(0, 1)
superdense.draw()

superdense qiskit

# Alice's encoding for message '11'
superdense.x(0)
superdense.z(0)
superdense.draw()

superdense qiskit 11

# Bob unentangles the two qubits (reverses the entangling gate)
superdense.cx(0,1)
superdense.h(0)

# The measurement pattern is `measure(qubit to measure, classical bit to store result)`
superdense.measure(0,0)
superdense.measure(1,1)
superdense.draw()

superdense qiskit measure

from qiskit.providers.basic_provider import BasicSimulator

sim = BasicSimulator()
# run the circuit on the simulator with 1 shot (execute the circuit once)
result = sim.run(superdense, shots=1).result().get_counts()

print(result)
# {'11': 1}

Custom Gates in Qiskit

bell = qiskit.QuantumCircuit(2, name='bell')
bell.h(0)
bell.cx(0, 1)
bell_gate = bell.to_gate()

c = qiskit.QuantumCircuit(2)
c.append(bell_gate, [0, 1])
c.draw()

Decompose a custom gate

ccxgate = qiskit.circuit.library.CCXGate()
ccx = qiskit.QuantumCircuit(3)
ccx.append(ccxgate, [0, 1, 2])

print(ccx)
print(ccx.decompose())

ccx

Cirq

Google

import cirq

alice = cirq.NamedQubit('Alice')
bob = cirq.NamedQubit('Bob')

superdense = cirq.Circuit()
superdense.append([
  cirq.H(alice),
  cirq.CNOT(alice, bob),
])

superdense.append([
  cirq.X(alice),
  cirq.Z(alice),
])

superdense.append([
  cirq.CNOT(alice, bob),
  cirq.H(alice),
])

superdense.append(cirq.measure(alice, bob, key='received'))

print(superdense)

simulator = cirq.Simulator()
result = simulator.run(superdense, repetitions=1)
print(result)
# received=1, 1

Pennylane

Xanadu

import pennylane as qml

def entangle():
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])

print(qml.draw(entangle)())

device = qml.device("default.qubit", wires=[0, 1])

# Wrap the quantum function as a QNode
entangle_qnode = qml.QNode(entangle, device)

# Set the number of shots to 10
entangle_qnode = qml.set_shots(entangle_qnode, shots=10)

import numpy as np

state = np.array([1, 1j], dtype=complex)

state = state / np.linalg.norm(state)

def teleport(state):
    # Ensure "state" is loaded into the first qubit
    # (otherwise qubits would start in |0>)
    qml.StatePrep(state, wires=0)

    # Shared entanglement between qubits 1 and 2
    qml.Hadamard(wires=1)
    qml.CNOT(wires=[1, 2])

    # Alice's operation:
    # CNOT from Alice's input qubit to her half of the Bell pair,
    # then Hadamard on the input qubit
    qml.CNOT(wires=[0, 1])
    qml.Hadamard(wires=0)

    # Measurement (store the classical outcomes)
    m0 = qml.measure(0)
    m1 = qml.measure(1)

    # Bob's conditional correction operations
    qml.cond(m1, qml.PauliX)(wires=2)
    qml.cond(m0, qml.PauliZ)(wires=2)

    # Return Bob's qubit as a density matrix
    return qml.density_matrix(wires=2)

print(qml.draw(teleport)(state))

Quantum Teleportation

-	Superdense Coding	Teleportation
Consumes	Entanglement	Entanglement
Sends	1 qubit	2 bits
Transmits	2 bits	1 qubit

if there is entanglement:
- we can use it to send 2 bits of classical information by sending 1 qubit (superdense coding)
- we can use it to send 1 qubit of quantum information by sending 2 bits of classical information (teleportation)
Alice wants to send a qubit $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ to Bob, but only has a classical channel.

Protocol

Alice and Bob share $|\Phi^+\rangle$ (pre-shared entanglement)
Alice applies $CNOT$ (her qubit -> her Bell half), then $H$ , then measures both, obtaining 2 classical bits (00, 01, 10, or 11)
Alice sends $mn$ classically to Bob
Bob applies $X^n Z^m$ to his qubit, recovering $|\psi\rangle$

For 00: Apply Identity $I$ (do nothing)
For 01: Apply Pauli-X $X$ (bit flip)
For 10: Apply Pauli-Z $Z$ (phase flip)
For 11: Apply $XZ$ (bit and phase flip)
Bob now has a qubit in the state Alice wanted to send $|\psi\rangle$ without Alice ever sending a qubit directly to Bob.

Teleportation Circuit

$|\psi\rangle \otimes |\Phi^+\rangle = \left(\alpha |0\rangle + \beta |1\rangle\right) \otimes \frac{1}{\sqrt{2}} \left(|00\rangle + |11\rangle\right)$

$|\psi\rangle \otimes |\Phi^+\rangle = \frac{1}{\sqrt{2}} \left(\alpha |0\rangle |00\rangle + \alpha |0\rangle |11\rangle + \beta |1\rangle |00\rangle + \beta |1\rangle |11\rangle\right)$

Alice applies a $CNOT$ gate with her qubit $|\psi\rangle$ as control and her half of the Bell pair $|\Phi^+\rangle$ as target.
- $\beta |1\rangle |00\rangle$ becomes $\beta |1\rangle |10\rangle$ (target flips when control is 1)
- $\beta |1\rangle |11\rangle$ becomes $\beta |1\rangle |01\rangle$ (target flips when control is 1)

$\frac{1}{\sqrt{2}} \Big(\alpha |0\rangle |00\rangle + \alpha |0\rangle |11\rangle + \beta |1\rangle |10\rangle + \beta |1\rangle |01\rangle\Big)$

Alice then applies a Hadamard gate to her qubit ( $|\psi\rangle$ ).
- $H|0\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}$
- $H|1\rangle = \frac{|0\rangle - |1\rangle}{\sqrt{2}}$
Together, the CNOT and Hadamard rotate Alice's two qubits so that a computational-basis measurement acts as a Bell measurement. Expanding and regrouping by the state of Alice's two qubits gives:

\begin{aligned} \frac{1}{\sqrt{2}} \Bigg[&\alpha \left(\frac{|0\rangle + |1\rangle}{\sqrt{2}}\right) \otimes |00\rangle + \alpha \left(\frac{|0\rangle + |1\rangle}{\sqrt{2}}\right) \otimes |11\rangle \\ &+ \beta \left(\frac{|0\rangle - |1\rangle}{\sqrt{2}}\right) \otimes |10\rangle + \beta \left(\frac{|0\rangle - |1\rangle}{\sqrt{2}}\right) \otimes |01\rangle \Bigg] \\ =\, &\frac{1}{2} \Big[ |00\rangle (\alpha |0\rangle + \beta |1\rangle) + |01\rangle (\alpha |1\rangle + \beta |0\rangle) \\ &+ |10\rangle (\alpha |0\rangle - \beta |1\rangle) + |11\rangle (\alpha |1\rangle - \beta |0\rangle) \Big] \end{aligned}

Alice measures her two qubits, resulting in one of four possible outcomes corresponding to the classical bits $mn$ where $m, n \in \{0, 1\}$
- Alice's two qubits: $|00\rangle, |0 1\rangle, |1 0\rangle, |1 1\rangle$
- Bob's qubit is in a state that depends on Alice's measurement outcome

$X^n Z^m |\psi\rangle$

mn	Bob's state	Bob applies
00	$\alpha \lvert 0\rangle + \beta \lvert 1\rangle$	$\mathbb{I}$
01	$\alpha \lvert 1\rangle + \beta \lvert 0\rangle$	$X$
10	$\alpha \lvert 0\rangle - \beta \lvert 1\rangle$	$Z$
11	$\alpha \lvert 1\rangle - \beta \lvert 0\rangle$	$XZ$
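The table can be checked with two-component state vectors. In this sketch Bob's pre-correction state is built as $X^n Z^m |\psi\rangle$ and the correction applies X first and then Z, mirroring the order of the conditional gates in the PennyLane listing; the example amplitudes are arbitrary.

```python
def apply_x(state):
    a, b = state
    return (b, a)        # bit flip: swap the amplitudes of |0> and |1>


def apply_z(state):
    a, b = state
    return (a, -b)       # phase flip: negate the amplitude of |1>


def bob_state(psi, m, n):
    """State of Bob's qubit after Alice measures (m, n): X^n Z^m |psi>."""
    state = psi
    if m:
        state = apply_z(state)
    if n:
        state = apply_x(state)
    return state


def bob_correct(state, m, n):
    """Undo the Pauli error: X if n = 1, then Z if m = 1."""
    if n:
        state = apply_x(state)
    if m:
        state = apply_z(state)
    return state


psi = (0.6, 0.8j)  # arbitrary normalized example: alpha = 0.6, beta = 0.8i
for m in (0, 1):
    for n in (0, 1):
        before = bob_state(psi, m, n)
        print(m, n, before, "->", bob_correct(before, m, n))
```

Applying the two corrections in the opposite order would give the same state up to a global sign, which is physically irrelevant.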

Channels

Classical Channel: carries bits (fibre, radio, paper, ...)
Quantum Channel: carries qubits (can also carry bits)
- Quantum channels can emulate classical ones, but not vice versa
Teleportation: classical channel + pre-shared entanglement -> effective quantum channel

Summary

In superdense coding, by sending only one qubit and using pre-shared entanglement, Alice can transmit two classical bits of information.
The operation $X^n \lvert a \rangle = \lvert a \oplus (n \bmod 2) \rangle$ correctly defines the effect of applying the $X$ gate $n$ times.
The tensor product of two identity operators is the identity operator on the composite space. In symbols, where the subscript is the dimension: $I_2 \otimes I_2 = I_4$ .
The state $\lvert \phi \rangle = \frac{1}{\sqrt{2}}(\lvert 0 \rangle + \lvert 1 \rangle)$ is an eigenstate of the Pauli- $X$ gate with eigenvalue $+1$ .
In the teleportation protocol, the classical communication channel is used to transmit two classical bits from Alice to Bob.
Three qubits are required for quantum teleportation.
After the teleportation protocol completes, Bob has a qubit in the state $\lvert \psi \rangle$ .
Of QASM, Qiskit, Cirq, and PennyLane, PennyLane is the only quantum language that spells out the full name of the Hadamard gate for its built-in gates.

TIM 004

March 9, 2026 · 3 min read

Gracefullight

Owner

Innovation Ecosystem

A network of organizations, people and resources that interact with each other to develop and support new ideas, technologies and businesses.
Innovation Ecosystem
- Start-up companies
- Medical Centers
- Mature Companies
- City, Regional and State Organizations
- Providers of Support Services (Legal, Accounting, etc)
- Venture Capital Funds
- Colleges & universities

Type of Innovation Ecosystem

Corporate innovation ecosystem
Digital innovation ecosystem
City-based innovation ecosystem
High-tech SMEs centered ecosystem (Small and Medium-sized Enterprises)
University-based ecosystem
Incubators and Accelerators ecosystems
Regional and National innovation ecosystems
Social innovation ecosystems

The core -> New Innovation Initiatives -> Startup Ecosystem -> Customers

Roles and Activities across Innovation Ecosystem

Leadership:
- Ecosystem leader
  - ecosystem governance: decipher roles, coordinate interactions, orchestrate resource flows
  - forging partnerships: attract & link partners, create collaboration, stimulate complementary
  - platform management: build platform, open platform, orchestrate complementors
  - value management: decipher bases of value, create & capture value
- Dominator: integrate actors
Direct Value Creation:
- Supplier: supply components
- Assembler: assemble components
- Complementor: provide complementarities
- User: define need, provide ideas, purchase & use
Value Support:
- Expert: generate knowledge, provide expertise, transfer technology
- Champion: build connections, provide access to markets
Entrepreneur Ecosystem:
- Entrepreneur: co-locate, set-up network
- Sponsor: give resources, co-develop offering, link to other actors
- Regulator: provide favorable conditions

Systems Thinking

a holistic and non-linear approach to problem solving that focuses on the interactions and patterns within an entire system
emphasizes relationships and feedback loops instead of analysing individual parts in isolation

Fishbone Diagram

common categories to help think broadly about the possible causes include:
People: skills, training, communication, motivation
Process / Methods: procedures, or workflows
Machines / Technology: tools, equipment, software
Materials: resources or inputs used
Environment: workplace or external conditions
Management / Policies: decisions, rules, or leadership

Innovation Networks

Connected systems of people working together toward shared objectives, often through internal teams and external partners such as suppliers, universities, accelerators, customers, and startups.
Innovation networks are also part of the innovation ecosystem.

Types of Innovation Networks

Entrepreneur-based
Internal project teams
Internal entrepreneur networks
Communities of practice: can involve players inside and across different organizations
Spatial clusters: e.g. Silicon Valley, Boston
Sectoral networks: bring different players together because they share a common sector
New product or process development consortium
New technology development consortium
Emerging standards: exploring and establishing standards around innovative technologies
Supply chain learning
Learning networks
Recombinant innovation networks: cross-sectoral groupings that allow for networking across boundaries and the transfer of ideas
Managed open innovation networks
User networks
Innovation markets
Crowdfunding and new resource approaches

Stakeholder Analysis

High power, high interest (Engage Closely)
High power, low interest (Keep Satisfied)
Low power, high interest (Keep Informed)
Low power, low interest (Monitor)
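
The four quadrants above amount to a two-question lookup, which can be sketched in a few lines of Python. This is only an illustration: the function name `stakeholder_strategy`, the boolean high/low cut-off, and the sample stakeholders are assumptions, not part of the original notes.

```python
def stakeholder_strategy(high_power: bool, high_interest: bool) -> str:
    """Return the engagement strategy for one quadrant of the power/interest grid."""
    if high_power and high_interest:
        return "Engage Closely"
    if high_power:
        return "Keep Satisfied"
    if high_interest:
        return "Keep Informed"
    return "Monitor"


# Hypothetical stakeholders, for illustration only: (high_power, high_interest)
stakeholders = {
    "Project sponsor": (True, True),
    "Regulator": (True, False),
    "End users": (False, True),
    "General public": (False, False),
}

for name, (power, interest) in stakeholders.items():
    print(f"{name}: {stakeholder_strategy(power, interest)}")
```

In practice, power and interest are usually judged on a scale rather than as yes/no values, so the cut-off between "high" and "low" is a choice the analyst has to make.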
