Skip to main content

CNN 007

· 6 min read

Transfer Learning

  • Knowledge acquired while solving one task, can be used to solve related tasks
  • Similar to the way humans apply knowledge acquired from on task to solve a new but similar, related task.

Transfer Learning Benefits

  1. Less training data required: Model trained using a large (similar) dataset can be used as a starting point for training on a smaller dataset.
  2. Faster training: Traninig can converage faster, du the use to existing knowledge (weights) to start with rather than from scratch.
  3. Better model generalization: Model is trained to identify features which can be applied to new contexts.

VGG-16

ApproachDescriptionUse CaseWhen to Use
Use Pre-trained ModelUse ImageNet pre-trained model without any additional trainingDogs & cats classificationWhen dataset distribution is similar to ImageNet with few samples
Train FC Layers OnlyUse CONV layers for feature extraction, train FC layers onlyDifferent class classification on similar domainWhen dataset is similar to ImageNet but different classes with limited samples
Train Last CONV + FC LayersTrain last CONV layers (specialized features) and FC layersSignificantly different data distribution domainWhen dataset differs greatly from ImageNet, different classes, and limited samples
Train All CONV + FC LayersTrain all CONV layers and FC layers (with modifications)Complex task with different domainWhen dataset differs greatly from ImageNet, different classes, dataset is large, and task is complex

AlexNet

  • Input: 224x224x3 image
  • Activiations: ReLU after each CONV and FC layer
  • Optimizer: SGD with Momentum
  • Regularization: Dropout in FC1 and FC2
  • Total Trainable Parameters: ~60 million
  • Traninig settings: Nvidia GTX 580 3BG GPUs for 6 days

GoogleNet

  • Accurary: top-5 test erorr rate of 6.7%
  • Close to human level performance
  • 22 layer deep CNN
  • Optimizer: RMSProp
  • Total Trainable Parameters: ~4 million (Significantly reduced)
  • A novel inception module was introduced

GoogleNet

Inecption Module

Inception Module

  • Use filters with different size together
  • Use different types of layers (CONV, POOL etc.) together
  • It leads to better performance and efficiency but complicated architecture.

1X1 Convolution

Input image (6×6×16 \times 6 \times 1), 1x1 kernel, and output can be declared as:

X=[100100100000100100100000100100100000100100100000100100100000100100100000],K=[3]X= \begin{bmatrix} 100&100&100&0&0&0\\ 100&100&100&0&0&0\\ 100&100&100&0&0&0\\ 100&100&100&0&0&0\\ 100&100&100&0&0&0\\ 100&100&100&0&0&0 \end{bmatrix}, \quad K=\begin{bmatrix}3\end{bmatrix} Y=KXY = K * X Y=[300300300000300300300000300300300000300300300000300300300000300300300000]Y= \begin{bmatrix} 300&300&300&0&0&0\\ 300&300&300&0&0&0\\ 300&300&300&0&0&0\\ 300&300&300&0&0&0\\ 300&300&300&0&0&0\\ 300&300&300&0&0&0 \end{bmatrix}

For channel reduction with a 1x1 convolution, each spatial location (i,j)(i,j) is a vector:

xi,jR256\mathbf{x}_{i,j} \in \mathbb{R}^{256}

One 1x1 layer with 128 filters is a matrix:

WR128×256,bR128W \in \mathbb{R}^{128 \times 256},\quad \mathbf{b} \in \mathbb{R}^{128}

At each location, output channels are computed by matrix multiplication:

zi,j=Wxi,j+b,yi,j=ReLU(zi,j)\mathbf{z}_{i,j}=W\mathbf{x}_{i,j}+\mathbf{b},\quad \mathbf{y}_{i,j}=\mathrm{ReLU}(\mathbf{z}_{i,j})

So the shape changes as:

64×64×256    1×1  Conv (128 filters)+ReLU  64×64×12864\times64\times256 \;\xrightarrow{\;1\times1\;\text{Conv (128 filters)}+\mathrm{ReLU}\;} 64\times64\times128

If we flatten all spatial positions (64×64=409664\times64=4096):

XflatR4096×256,Yflat=ReLU(XflatWT+1bT)R4096×128X_{\text{flat}} \in \mathbb{R}^{4096\times256},\quad Y_{\text{flat}}=\mathrm{ReLU}\left(X_{\text{flat}}W^T+\mathbf{1}\mathbf{b}^T\right) \in \mathbb{R}^{4096\times128}

Inception V2 and V3

  • V1 (GoogleNet): Replace one 5x5 conv with two stacked 3x3 conv layers.
    • Number of parameters: 52=255^2=25 vs. 2×32=182\times3^2=18 (about 28% reduction)
  • V2: Factorize an n×nn\times n conv into 1×n1\times n and n×1n\times1 convs.
    • For 3×33\times3: 32=93^2=9 vs. 3+3=63+3=6 (about 33% reduction)
  • V3: Use more aggressive factorization and branch design (e.g., 1×71\times7 and 7×17\times1), plus efficient grid-size reduction.
    • Improves the accuracy-efficiency tradeoff while keeping computation manageable

ResNet

Deep Residual Networks, skip connections, and identity mappings

  • Enabled the development of the much deeper networks
  • ResNet is composed of residual blocks were introduced to address the vanishing gradient problem in deep networks.
    • Degradation problem: adding more layers eventually have negative effect on the final performance

ResNet

Git Tricks for Onboarding to a New Codebase

· 3 min read

High-Churn Files

git log --format=format: --name-only --since="1 year ago" | sort | uniq -c | sort -nr | head -20

  • Shows the most frequently changed files in the last year.
  • These files often represent areas of the codebase with the highest maintenance burden.
  • The top files can be cross-analyzed with bug hotspots to identify the highest-risk parts of the system.

Code Ownership and Bus Factor

git shortlog -sn --no-merges

  • Shows the number of commits by each author, excluding merge commits.
  • If one person accounts for more than 60% of commits, the project may have a bus factor risk.
  • If a top contributor has not been active in the last 6 months, it may indicate a maintenance gap.
  • With only 3 out of 30 contributors active over the past year, this suggests a knowledge discontinuity caused by developer turnover.
  • However, if the team uses squash merges, the commit history may be misleading for this analysis.

Bug Hotspots

git log -i -E --grep="fix|bug|broken" --name-only --format='' | sort | uniq -c | sort -nr | head -20

  • Shows the top 20 files with the most bug-related commits.
  • By comparing this list with the high-churn files, we can identify code that is both frequently changed and bug-prone.
  • While the accuracy depends on the quality of commit messages, even an approximate bug hotspot map can still be useful.

Development Velocity: Acceleration or Stagnation

git log --format='%ad' --date=format:'%Y-%m' | sort | uniq -c

  • Monthly commit counts provide a visual view of project activity over time.
  • A consistent or increasing commit frequency suggests healthy development.
  • A sudden drop, such as a 50% decrease in commits within a month, may signal the departure of key contributors or a shift in project focus.
  • A sustained decline over 6-12 months suggests a loss of team momentum, while periodic spikes followed by stagnation may indicate a batch-style release pattern.
  • In one real-world case, a CTO recognized from a commit velocity chart that a specific point in time aligned with the departure of a senior engineer.
  • This data reflects not just code activity, but team dynamics.

Reverts, Hotfixes, and Firefighting Signals

git log --oneline --since="1 year ago" | grep -iE 'revert|hotfix|emergency|rollback'

  • Measures the frequency of urgent fixes and recovery actions.
  • A few incidents per year are normal, but incidents every two weeks may signal a lack of trust in the deployment process.
  • This often indicates deeper issues such as unstable tests, the absence of a staging environment, or complex rollback procedures.
  • A result of zero may indicate either a stable codebase or poorly labeled commit messages.
  • Crisis patterns tend to be clearly visible, and their mere presence is often enough to assess operational reliability.

IQC 006

· 15 min read

Boolean Functions

f:{0,1}n{0,1}f: \{0,1\}^n \to \{0,1\}

xf0f_0f1f_1f2f_2f3f_3
00011
10101
  • n=1n=1
  • Constant: smae output for all inputs (f0f_0 and f3f_3).
  • Balanced: outputs 0 for exactly half, 1 for the other half (f1f_1 and f2f_2).
  • In the worst case, need 2n1+12^{n-1}+1 quries to decide which type ff is exponetial in nn.
xf0f1f2f3f4f5f6f7f8f9f10f11f12f13f14f15000000000011111111010000111100001111100011001100110011110101010101010101\begin{array}{c|cccccccccccccccc} x & {\color{orange}{f_0}} & f_1 & f_2 & {\color{blue}{f_3}} & f_4 & {\color{blue}{f_5}} & {\color{blue}{f_6}} & f_7 & f_8 & {\color{blue}{f_9}} & {\color{blue}{f_{10}}} & f_{11} & {\color{blue}{f_{12}}} & f_{13} & f_{14} & {\color{orange}{f_{15}}} \\ \hline 00 & {\color{orange}0} & 0 & 0 & {\color{blue}0} & 0 & {\color{blue}0} & {\color{blue}0} & 0 & 1 & {\color{blue}1} & {\color{blue}1} & 1 & {\color{blue}1} & 1 & 1 & {\color{orange}1} \\ 01 & {\color{orange}0} & 0 & 0 & {\color{blue}0} & 1 & {\color{blue}1} & {\color{blue}1} & 1 & 0 & {\color{blue}0} & {\color{blue}0} & 0 & {\color{blue}1} & 1 & 1 & {\color{orange}1} \\ 10 & {\color{orange}0} & 0 & 1 & {\color{blue}1} & 0 & {\color{blue}0} & {\color{blue}1} & 1 & 0 & {\color{blue}0} & {\color{blue}1} & 1 & {\color{blue}0} & 0 & 1 & {\color{orange}1} \\ 11 & {\color{orange}0} & 1 & 0 & {\color{blue}1} & 0 & {\color{blue}1} & {\color{blue}0} & 1 & 0 & {\color{blue}1} & {\color{blue}0} & 1 & {\color{blue}0} & 1 & 0 & {\color{orange}1} \end{array}

The Quantum Oracle

  • Traditional Oracle: black box that computes ff.
  • Query complexity: number of queries to the oracle needed to solve a problem.

xOff(x)x \xrightarrow{O_f} f(x)

  • Quantum Oracle: unitary operation that encodes ff.

Ufxy=xyf(x)U_f \ket{x}\ket{y} = \ket{x}\ket{y \oplus f(x)}

  • First register: input xx.
  • Second register: auxiliary qubit initialized to 0\ket{0} or 1\ket{1}.
  • Oracle don't change xx, but flips yy if f(x)=1f(x)=1.
    • 00=00 \oplus 0 = 0
    • 01=10 \oplus 1 = 1
    • 10=11 \oplus 0 = 1
    • 11=01 \oplus 1 = 0
  • f(x)=0f(x) = 0: yy unchanged.
  • f(x)=1f(x) = 1: yy flipped.
  • Example:
    • x0\ket{x}\ket{0}
    • Ufx0=x0f(x)=xf(x)U_f \ket{x}\ket{0} = \ket{x}\ket{0} \oplus \ket{f(x)} = \ket{x}\ket{f(x)}.
    • if f(x)=0f(x) = 0: x0\ket{x}\ket{0}.
    • if f(x)=1f(x) = 1: x1\ket{x}\ket{1}
  • so that it ignores the input xx and only flips the second qubit if f(x)=1f(x)=1.
  • it can be reversed: (yf(x))f(x)=y(y \oplus f(x)) \oplus f(x) = y.

Phase Kickback

  • Prepare the scratch qubit in =12(01)\ket{-} = \frac{1}{\sqrt{2}}(\ket{0} - \ket{1}).
    • Ufx=12(Ufx0Ufx1)U_f \ket{x}\ket{-} = \frac{1}{\sqrt{2}}(U_f \ket{x} \ket{0} - U_f \ket{x} \ket{1})
    • =12(x0f(x)x1f(x)) = \frac{1}{\sqrt{2}}(\ket{x} \ket{0 \oplus f(x)} - \ket{x}\ket{1 \oplus f(x)})
    • if f(x)=012(x00x10)=12(x0x1)=x12(01)=xf(x) = 0 \rightarrow \\ \frac{1}{\sqrt{2}}(\ket{x}\ket{0 \oplus 0} - \ket{x}\ket{1 \oplus 0}) \\ = \frac{1}{\sqrt{2}}(\ket{x}\ket{0} - \ket{x}\ket{1}) \\ = \ket{x} \frac{1}{\sqrt{2}}(\ket{0} - \ket{1}) \\ = \ket{x}\ket{-}.
    • if f(x)=112(x01x11)=12(x1x0)=x12(01)=xf(x) = 1 \rightarrow \\ \frac{1}{\sqrt{2}}(\ket{x}\ket{0 \oplus 1} - \ket{x}\ket{1 \oplus 1}) \\ = \frac{1}{\sqrt{2}}(\ket{x}\ket{1} - \ket{x}\ket{0}) \\ = - \ket{x} \frac{1}{\sqrt{2}}(\ket{0} - \ket{1}) \\ = -\ket{x}\ket{-}.
      • it doesn't matter where the phase is, it can be moved around:
      • α(ψϕ)=αψϕ=ψαϕ\alpha(\ket{\psi} \otimes \ket{\phi}) = \alpha\ket{\psi} \otimes \ket{\phi} = \ket{\psi} \otimes \alpha\ket{\phi}.
  • if we put the second register to \ket{-}, f(x)f(x) will be encoded in the phase of the first register.

x(1)f(x)x\ket{x}\ket{-}\mapsto (-1)^{f(x)} \ket{x} \ket{-}

  • The function's output has been "kicked back" into the phase of the first register, while the second register remains unchanged.

The Deutsch Problem

  • Given a boolean function f:{0,1}n{0,1}f:\{0,1\}^{n} \to \{0,1\}, determine if ff is constant or balanced.
  • Constant: same output for all inputs.
  • Balanced: outpus 0 for half the inputs and 1 for the other half.

Deutsch's Algorithm

  1. Prepare 01\ket{0}\ket{1}.
  2. Apply HH to both qubits +\to \ket{+}\ket{-}.
  3. Apply oracle UfU_f.
  4. Phase kickback encodes ff in the input phase.
  5. Apply HH to input qubit, then measure.
  • One query to the oracle is sufficient to determine if ff is constant or balanced.
  • if measure 0: ff is constant.
  • if measure 1: ff is balanced.

Deustch-Jozsa Algorithm

The direct generalization for any nn-bit boolean function.

  1. Input register nn qubits: Initialize qubits in 0\ket{0} and apply HH gate to each one.
  2. Scratch Qubit 11 qubit: Initialize in 1\ket{1} with XX and then apply an HH gate.
  3. Oracle: Apply UfU_f to the input and scratch registers (All qubits).
  4. Final Hadamards: Apply HH to input qubits.
  5. Measurement: Measure the input register.
  • if all qubits returned 00: ff is constant.
  • if any qubit returned 11: ff is balanced.

Deutsch-Jozsa Algorithm

How it works

  • The initial state is

0001\ket{00 \cdots 0} \ket{1}

  • After applying HH gates to the input and scratch registers, we get

00012n(000+001++111)\ket{00 \cdots 0} \rightarrow \frac{1}{\sqrt{2^n}} \left( \ket{00 \cdots 0} + \ket{00 \cdots 1} + \cdots + \ket{11 \cdots 1} \right)

  • The input register is in a superposition of a computational states containing all possible inputs to nn-bit string.
  • The last qubit hasn't changed from n=1n = 1 case, so it is in the state \ket{-}.
    • H1=H\ket{1} = \ket{-}
  • The definition of the oracle was generic for any bit string: Ufx=(1)f(x)xU_f \ket{x}\ket{-} = (-1)^{f(x)} \ket{x} \ket{-}
  • The phase-kickback puts a phase in front of each term in the input register that depends on the output of the function ff: 12n(  (1)f(000)000++(1)f(111)111  )\frac{1}{\sqrt{2^n}}\big(\;(-1)^{f(00\ldots0)}|00\ldots 0\rangle+ \cdots +(-1)^{f(11\ldots1)}|11\ldots 1\rangle\;\big)
    • The scratch qubit remains in \ket{-}, so we can ignore it for the rest of the algorithm.
  • Constant case
    • if ff is constant, then all the phases are the same, either +1+1 or 1-1:
    • f(000)=f(001)==f(111)=0f(00 \cdots 0) = f(00 \cdots 1) = \cdots = f(11 \cdots 1) = 0
    • f(000)=f(001)==f(111)=1f(00 \cdots 0) = f(00 \cdots 1) = \cdots = f(11 \cdots 1) = 1
    • The phse in front of every computational state is the same, Either +1+1 or 1-1.
    • Before the second application of the HH gates, the state of the input register is:
    • ±12n(000+001++111)\pm \frac{1}{\sqrt{2^n}} \left( \ket{00 \cdots 0} + \ket{00 \cdots 1} + \cdots + \ket{11 \cdots 1} \right)
    • In measurement, in either case, the probability to obtain P(000)=1P(00 \cdots 0) = 1
    • A constant function deterministically returns all zeros with a single query.
  • Balanced case
    • we are promised the function is either constant or balanced, so there are equal number of +1+1 and 1-1 phases. so we don't need to consider this case.
    • If the measurement produces anything but all zeros, we know with certainty the function is not constant, so it must be balanced.
    • There are a lot of balanced function, but half the terms in the superposition will exactly have a 1-1 phase.
    • It's clearly orthogonal to the state with all ones in superposition.
    • Appllying HH's will change the state to some other superpostiion, or perhaps a unique computational state.
    • But, orthogonality to 000\ket{00 \cdots 0} must remain.
    • A balenced function deterministically returns a state with at least one entry as 11, with a single query.

One quantum query vs 2n1+12^{n-1} + 1 classical queries. It is an algorithm that puts all inputs into superposition at once, encodes the function values as phases, and then uses interference to distinguish between constant and balanced functions.

Buildling Oracles

Multi-Controlled and Anti-Controlled Gates

  • build UfU_f that applies XX to the output qubit exactly when f(x)=1f(x) = 1.
    • if n=2n=2, the multi-controlled XX gate is a Toffoli gate.
    • x=11x=11 is the only input that gives f(x)=1f(x) = 1, so we can use a Toffoli gate with controls on the first two qubits and target on the output qubit.
      • This is because the Toffoli gate will flip the output qubit if and only if both control qubits are 1\ket{1}, which corresponds to the input x=11x=11.
    • if x=00x=00 is the only input that gives f(x)=1f(x) = 1, we can use an anti-controlled Toffoli gate, which applies XX to the target qubit if both control qubits are 0\ket{0}.
      • This is because the anti-controlled Toffoli gate will flip the output qubit if and only if both control qubits are 0\ket{0}, which corresponds to the input x=00x=00.
      • Applying XX gates to the two control qubits, then applying a Toffoli gate, and then applying XX gates again to the control qubits will effectively create an anti-controlled Toffoli gate.

Anti-Controlled Gates

  • A multi-controlled XX gates (CnXC^nX) flips the target only all control qubits are 1\ket{1}.
  • To target a specific input xx:
    • Place XX gates on each qubit ii where xi=0x_i = 0. (anti-control)
    • Apply CnXC^nX.
    • Undo the XX gates.
  • For nn qubits, it can be generalized to perform an XX gate (or any UU) with nn control qubits, requireing n1n-1 extra scratch qubits and 2(n1)2(n-1) Toffoli gates.
  • Whenever scratch qubits are invoked, always see a symmetric pattern of gates.
  • Uf8Uf1xy=xyf1(x)f8(x)=xyf9(x)U_{f8} U_{f1} \ket{x}\ket{y} = \ket{x}\ket{y \oplus f_1(x) \oplus f_8(x)} = \ket{x}\ket{y \oplus f_9(x)}

CCCX

  1. The computation is done using the scratch qubits.
  2. The answer is copied to the target or output register
  3. The computation is inverted to reset the scratch qubits to 0\ket{0}.
  • called "uncomputation", ensures the scratch qubits are returned to their initial state.
  • no input or output qubits are entangled with the scratch qubits at the end of the algorithm.

Implementing Deutsch-Jozsa

def random_oracle(n):
# Circuit object to hold the gates
circuit = qiskit.QuantumCircuit(n + 1)

# With 50% probability, return a constant oracle
if np.random.randint(0, 2):
qasm = ""
# another 50:50 chance of it being a 1 instead of 0 oracle
if np.random.randint(0, 2):
qasm += f"x q[{n-1}];"
return qasm, "constant"

# A balanced function has half the inputs as 0
# Randomly select where those are
zero_strings = np.random.choice(range(2**n),int(2**(n-1)),replace=False)

for string in zero_strings:

# Convert base 10 to 2
bitstring= f"{string:0b}"

# X gates for 0 locations
for xi, bit in enumerate(reversed(bitstring)): # enumerate iterates through the list as well as the index in the list
if bit == "1":
circuit.x(xi)

# C^n X gate
circuit.mcx(list(range(n)), n)

# X gates for 0 locations
for xi, bit in enumerate(reversed(bitstring)):
if bit == "1":
circuit.x(xi)

transpiled_circuit = qiskit.transpile(circuit, basis_gates=["u1", "u3", "u2", "cx"])
qasm = qiskit.qasm2.dumps(transpiled_circuit)[47:]
return qasm, "balanced"
OPENQASM 2.0;
qreg q[3];
creg c[2];

x q[2];
h q[0];
h q[1];
h q[2];

/* oracle for f(00)=1 */
x q[0];
x q[1];
ccx q[0],q[1],q[2];
x q[0];
x q[1];

/* oracle for f(11)=1 */
ccx q[0],q[1],q[2];

h q[0];
h q[1];

measure q[0] -> c[0];
measure q[1] -> c[1];
OPENQASM 2.0;
qreg q[n+1];
creg c[n];

x q[n];
h q[0];
h q[1];
...
h q[n];
/* oracle U_f */
h q[0];
h q[1];
...
h q[n-1];
measure q[0] -> c[0];
...
measure q[n-1] -> c[n-1];

Bernstein-Vazirani Algorithm

  • Given fs:{0,1}n{0,1}f_s: \{0,1\}^n \to \{0,1\} defined as fs(x)=sxmod2f_s(x) = s \cdot x \mod 2
    • where ss is an unknown nn-bit string and xx is the input.
  • The goal is to determine the hidden string ss as few queries as possible.
  • Classically: query ff with input e0=0001e_0 = 00 \cdots 01, e1=0010e_1 = 00 \cdots 10, ..., en1=1000e_{n-1} = 10 \cdots 00 to get each bit of ss.
    • Total nn queries.
  • Quantum: use the exact same circuit as Deutsch-Jozsa.
    • Ufsx=(1)sxxU_{f_s} \ket{x} = (-1)^{s \cdot x} \ket{x}.
      • where we can ignore the output register in the \ket{-} state.
      • For n=1n =1, the state after the oracle before the final HH gate is:
        • 12(0+(1)s1)\frac{1}{\sqrt{2}} \left( \ket{0} + (-1)^{s} \ket{1} \right)
      • which is +\ket{+} if s=0s=0 and \ket{-} if s=1s=1.
    • Applying HH to this state returns ss:
      • H12(0+(1)s1)=sH \frac{1}{\sqrt{2}} \left( \ket{0} + (-1)^{s} \ket{1} \right) = \ket{s}.
    • The measurement deterministically reveals ss.

n=2 case

  • s=s1s0s = s_1s_0 and x=x1x0x = x_1 x_0
  • sx=s1x1+s0x0\rightarrow s\cdot x = s_1x_1 + s_0 x_0
12((1)s00+s1000+(1)s01+s1001+(1)s00+s1110+(1)s01+s1111)=12(00+(1)s001+(1)s110+(1)s1+s011).\begin{aligned} &\frac{1}{2}\big((-1) ^{s_0\cdot 0+s_1\cdot 0} \ket{00} + (-1) ^{s_0\cdot 1+s_1\cdot 0} \ket{01}+(-1) ^{s_0\cdot 0+s_1\cdot 1} \ket{10}+(-1) ^{s_0\cdot 1+s_1\cdot 1} \ket{11}\big)\\ &=\frac{1}{2}\big( \ket{00} + (-1) ^{s_0} \ket{01}+(-1) ^{s_1} \ket{10}+(-1) ^{s_1+s_0} \ket{11}\big). \end{aligned}

this factorized into:

12(0+(1)s11)12(0+(1)s01).\frac{1}{\sqrt{2}}\big(\ket 0 + (-1)^{s_1}\ket 1\big)\otimes \frac{1}{\sqrt{2}}\big(\ket 0 + (-1)^{s_0}\ket 1\big).

this reduces the same agument as n=1n=1 for each qubit where after the final HH gates, the state becomes:

s1s0s1s0.\ket{s_1}\otimes \ket {s_0} \equiv \ket{s_1 s_0}.

Implemnting Bernstein-Vazirani

Ufsxy=xy(sx)U_{f_s} \ket{x} \ket y = \ket{x} \ket {y\oplus (s\cdot x)} y(sx)=ys0x0s1x1sn1xn1y\oplus (s\cdot x) = y\oplus s_0 x_0 \oplus s_1 x_1 \oplus \cdots \oplus s_{n-1} x_{n-1} Ufsxy=xysxU_{f_s} \ket{x} \ket y = \ket{x} \ket {y\oplus s x}
  • I\mathbb{I} if s=0s = 0 and CNOTCNOT if s=1s = 1.
  • To implement fs(x)=sxf_{s}(x) = s \cdot x, we can use a CNOT from qubit ii to the scratch qubit if si=1s_i = 1.
import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator


def bv_query(n, secret=None):
# Build oracle for f_s(x) = s · x
# q[0]..q[n-1] = input register
# q[n] = scratch/output qubit

if secret is None:
value = np.random.randint(0, 2 ** n)
secret = format(value, f"0{n}b")
else:
secret = secret.zfill(n)

oracle = QuantumCircuit(n + 1, name="Uf")
for index, bit in enumerate(reversed(secret)):
if bit == "1":
oracle.cx(index, n)

return oracle, secret


def bernstein_vazirani_circuit(n, secret=None):
# Build full Bernstein-Vazirani circuit
oracle, secret = bv_query(n, secret)

qc = QuantumCircuit(n + 1, n)

# Prepare scratch qubit in |->
qc.x(n)

# Apply Hadamards to all qubits
for i in range(n + 1):
qc.h(i)

# Apply oracle
qc.compose(oracle, inplace=True)

# Apply final Hadamards to input register
for i in range(n):
qc.h(i)

# Measure input register
for i in range(n):
qc.measure(i, i)

return qc, secret


def run_bernstein_vazirani(n, secret=None, shots=1):
qc, secret = bernstein_vazirani_circuit(n, secret)

simulator = AerSimulator()
compiled = transpile(qc, simulator)
result = simulator.run(compiled, shots=shots).result()
counts = result.get_counts()

measured = max(counts, key=counts.get)
recovered = measured[::-1]

return qc, secret, recovered, counts


# Example
qc, secret, recovered, counts = run_bernstein_vazirani(5, "01101")
print(qc.draw())
print("Secret :", secret)
print("Measured :", recovered)
print("Counts :", counts)
OPENQASM 2.0;
qreg q[6];
creg c[5];

x q[5];
h q[0];
h q[1];
h q[2];
h q[3];
h q[4];
h q[5];

cx q[0],q[5];
cx q[2],q[5];
cx q[3],q[5];

h q[0];
h q[1];
h q[2];
h q[3];
h q[4];

measure q[0] -> c[0];
measure q[1] -> c[1];
measure q[2] -> c[2];
measure q[3] -> c[3];
measure q[4] -> c[4];

Summary

  • There are 22n2^{2^n} possible Boolean functions from {0,1}n\{0,1\}^n to {0,1}\{0,1\}.
  • A Boolean function is balanced if it outputs 1 on exactly half of the inputs, that is, on 2n12^{n-1} out of the 2n2^n possible inputs.
  • In Deutsch’s algorithm with n=1n=1, only 1 quantum query is needed to distinguish a constant function from a balanced function.
  • A quantum oracle for ff is defined as the unitary Uf: xyxyf(x).U_f:\ |x\rangle|y\rangle \mapsto |x\rangle|y\oplus f(x)\rangle.
  • Every XOR-based function of the form f(x)=xjxkf(x)=x_j\oplus x_k\oplus\cdots that depends on at least one input bit is balanced.
  • Using multi-controlled XX gates, together with anti-controls when needed, we can implement an oracle for any Boolean function.
  • The phase kickback multiplies the state by the phase factor (1)f(x)(-1)^{f(x)}, so the phase changes exactly when f(x)=1f(x)=1.
  • If these questions were stored in a Python list called questions, then the 8th question would be questions[7].
  • Applying Hadamard gates to nn qubits initialized in 0|0\rangle produces an equal superposition over all 2n2^n computational basis states.
  • In Deutsch–Jozsa for n>1n>1, measuring all input qubits as 0 means that ff is constant.
  • Bernstein–Vazirani solves the hidden-string problem f(x)=sxf(x)=s\cdot x with 1 quantum query.
  • A multi-controlled XX gate with 3 controls can be implemented using scratch qubits and 4 Toffoli gates.
  • Applying Uf1U_{f_1} followed by Uf2U_{f_2} yields an oracle whose action on the scratch qubit corresponds to f1(x)f2(x)f_1(x)\oplus f_2(x).
  • Uncomputation resets scratch qubits to their initial states by applying the inverse of the computation, which means reversing the order of the steps.
  • Multi-controlled gates with more than one control can be decomposed into simpler gates, but in general this requires more than just CXCX and XX; single-qubit gates are also needed.

CNN 006

· 4 min read

Data Preparation

  • Small Dataset (Range: 100 to 100,000 samples)
    • Train/Valid/Test: 60/20/20
    • Train/Test: 70/30
  • Large Dataset (Range: 500,000 to 1M+ samples)
    • Train/Valid/Test: 98%/10,000/10,000
    • usaully more traning data is get better performance.
  • Rule of Thumb: Validation and Test set should com from the same distribution.

Bias and Variance

  • Bias: A value that allows to shift the activation function to left or right to better fit the data.
    • With bias the curve/line will not always pass through origin
    • can get a better fit to training data
  • Variance: The sensitivity of the model to small fluctuations in the training data.
    • The change in prediction accuracy of ML model between training data and test data.
    • Model with high variance pays a lot of attention to tranining data and does not generalize on the data which is has not seen before.
    • With high variance, model perform very well on training data but poorly on test data.

Bias and Variance

  • High Bias
    • High training error, underfitting
    • Validation/test error nearly same as train error
    • Potential things to try:
      • Increase features
      • Make ML model more complicated
      • Decrease Regularation parameters
  • High Variance
    • Low tranining error, overfitting
    • High validation/test error
    • Potential things to try:
      • Increase dataset size
      • Reduce input features
      • Increasing Regularization parameter

Accuracy

  • Bayesian Optimal Error (BOE): Best optimal error that can be achieved by any model on a given dataset.
  • Human-Level performance:
    • Humans are very good at a lot of tasks
    • Can get labelled data from humans to help improve the model performance
    • Gain insights from manual error analysis

Regularization

  • a technique which makes slight modifications to the learning algorithm such that the model generalizes better on the unseen data.
  • Update the loss/cost function by adding a regularization term
    • Loss function=Loss+Regularization term(λ)\text{Loss function} = \text{Loss} + \text{Regularization term}(\lambda)
    • Due to λ\lambda, the weight matrices will decrease, assuming a neural network with smaller weight matrices leads to simpler model.
    • Regularization penalizes the weights matrices of the nodes
  • L2 regularization
  • L1 regularization
  • Dropout

L2 Regularization

Cost function=Loss+λ2mj=1nxwj2\text{Cost function} = \text{Loss} + \frac{\lambda}{2m} \sum_{j=1}^{n_x} w_j^2

  • λ\lambda is a hyper-parameter
  • as weight decay, as it forces the weight to decay towards zero, but not exactly zero.

L1 Regularization

Cost function=Loss+λ2mj=1nxwj\text{Cost function} = \text{Loss} + \frac{\lambda}{2m} \sum_{j=1}^{n_x} |w_j|

  • Penalize the absolute value of the ww
  • Weight may reduce to zero
  • Useful in compressing a model (sparse model)

Dropout

  • It produces good reuslts and most popular regularization technique in deep learning.
  • At every iteration, it randomly selects and drops some nodes and remove all the connections to those nodes.
  • Each iteration has a different set of nodes.

Data Augmentation

  • Simple way to reduce overfitting is to increase size of tranining dataset.
  • By creating more sample using the existing set and applying the following simple operations
    • Flip
    • Rotate
    • Scale
    • Crop
    • Translate
    • Gaussian Noise

Cutout

  • Simple regularization technique of randomly masking out square regions of input during training.
  • Patch size: 16x16 to 64x64
  • Fill value: 0 or mean pixel value
  • Patches: 1-3 per image

Mixup

  • Trains a neural network on convex combinations of pairs of examples and their labels.
  • It regularizes the neural network to favor simple lienar behavior in-between training examples.
  • Image A (λ=0.55\lambda = 0.55) + Image B (λ=0.45\lambda = 0.45) = Blended Output

CutMix

  • Patches are cut and pasted among training images, where the ground truth labels are also mixed proportionally to the area of the patches.
  • Image A + Image B (Patch) = Pasted Patch Output.

Random Agumentation

  • A set of augmentation operations is defined, and a random subset of these operations is applied to each image during training.
  • Identity, AutoContrast, Equalize, Rotate, Solarize, Color, Posterize, Contrast, Brightness, Sharpness, ShearX/Y, TranslateX/Y

Generative Adversarial Networks (GANs)

  • Able to generate images which look similar to the original ones
  • Proven to be very effective in data augmentation, especially when the dataset is small.

Neural Style Transfer

  • Using CNN to separate style
  • Transfer style to different image

리서치 가이드

· One min read

주제 선정

  • 박사과정 4년 중 1.5년은 읽기만 해야한다.
  • Area of Interest (AOI)는 내가 좋아하는 걸 해야 버틸 수 있다.
  • 그리고 그 1.5년 뒤에는 Research Question이 나올 것이다.
  • 4년 내의 논문에서 Future Work를 취합하자.

좋은 연구 주제

  • Government
    • Originality of content, document
    • How can differntiate from generated content?

Master leading to PhD

논문

  • Abstract가 흥미있으면 Conclusion을 읽고, 그 다음 관련 내용이면 전체를 읽어야한다.
  • 공식을 처음부터 이해하려고 하지 말자.
  • 2022년 이후 (4개년까지만) 논문을 읽어야한다.
    • 왜냐면 박사과정이 4-5년이니까.

CNN 005

· 3 min read

Computer Vision

  • Classification
  • Classification with Localization
  • Object Detection
-ANNCNN
Input1D vector3D tensor (height, width, channels)
ConnectionsFully connectedLocal connections (receptive fields)
OverfittingProne to overfittingLess prone to overfitting

Convolutional Neural Networks (CNN)

  1. Convolutional Layer (CONV)
  2. Pooling Layer (POOL)
  3. Fully Connected Layer (FC)

LENET-5

Convulutional Layer (CONV)

  • The first layer to extraact features from an input image
  • Core buildling block of a CNN
  • Convolutions are basic operation in this layer
  • A number of filters (e.g. edge detectors) are applied to the input image.

Padding

  • Padding is used to control the spatial size of the output feature maps.
  • Negative values at the edges can naturally arise because of padding, and they usually are not a big problem because activation functions and later layers come afterward.
  • Input Matrix dimension: n×n×cn \times n \times c (height, width, channels)
  • Filter size: f×ff \times f
  • Padding (PP): 1, number of pixels added to the border of the input
  • (n×n)(f×f)(n+2Pf+1)×(n+2Pf+1)(n \times n) * (f \times f) \to (n + 2P - f + 1) \times (n + 2P - f + 1)
    • Example: 5×55 \times 5 input with 3×33 \times 3 filter and padding of 1 results in a 5×55 \times 5 output feature map.
  • if input and output matrix dimensions are the same, then P=f12P = \frac{f - 1}{2}.
  • Valid padding (P=0P = 0): No Padding
  • Same padding (P=f12P = \frac{f - 1}{2}): Output size and input size is same, this requires appropriate padding.

Stride

  • It is the number of pixels by which slide the filter across the input image.
No Padding StridesStride with Padding
no padding stridesstride with padding
  • Github: vdumoulin/conv_arithmetic
  • Input Matrix dimension: n×nn \times n
  • Filter size: f×ff \times f
  • Padding: PP
  • Stride: SS
  • Output Size = n+2PfS+1×n+2PfS+1 \left\lfloor \frac{n + 2P - f}{S} + 1 \right\rfloor \times \left\lfloor \frac{n + 2P - f}{S} + 1 \right\rfloor
    • Example: Input Matrix dimension: 5×55 \times 5, Filter size: 3×33 \times 3, Padding: 11, Stride: 22 results in an output size of 2×22 \times 2.

Pooling Layer (POOL)

  • Down sampling operation which reduces the dimensionality of a matrix.
  • Reduces the number of parameters for large image, but retain the valuable information.
  • Max pooling
  • Average pooling
  • Sum pooling

Fully Connected Layer (FC)

  • a traditional Multi-layer Perception (MLP) layer
  • For multi-class classification, usually Softmax activation is used.
  • Softmax ensures the output.
  • Output of the CONV and POOL layers represent a high level features of the Input image.
  • The FC layer takes these features to classify the input image into the desired output classes.

CNN 004

· 6 min read

Logistic Regression as Neural Network

  • if y=1y = 1
    • L=log(y^)L = -\log(\hat{y})
    • if y^1\hat{y} \to 1, then L0L \to 0 (low loss)
    • if y^0\hat{y} \to 0, then LL \to \infty (high loss)
  • if y=0y = 0
    • L=log(1y^)L = -\log(1 - \hat{y})
    • if y^0\hat{y} \to 0, then L0L \to 0 (low loss)
    • if y^1\hat{y} \to 1, then LL \to \infty (high loss)

Gradient Descent

  • it is an iterative approach for error correction in a machene learning model
  • Find ww and bb that will minimize GD(w,b)GD(w, b) (requires Loss/Cost function)
  1. Initialize ww and bb
  2. Perform Forward pass operation/calculations
  3. Compute Loss/Cost function L(a,y)L(a, y)
  4. Compute change in ww and bb (Take the partial derivative of the cost function with respect to Weights and bias dwdw and dbdb)
  5. Update ww and bb (w:=wαdww := w - \alpha dw and b:=bαdbb := b - \alpha db)
  6. Repeat from Step 2 with new values of ww and bb for 'n' number of iterations.
  • α\alpha is the learning rate (hyperparameter) that controls how much we are adjusting the weights and bias of our model with respect to the loss gradient. It is a small positive value (e.g., 0.01, 0.001) that determines the step size at each iteration while moving toward a minimum of the loss function.

Gradient Descent Types

  • Batch Gradient Descent (BGD)
  • Stochastic Gradient Descent (SGD)
  • Mini-batch Gradient Descent (MBGD)

Batch Gradient Descent (BGD)

  1. Process each input sample and find the cost
  2. Find the average cost oveer all input samples
  3. Update ww and bb and repeat the steps for "n" epochs(iterations)
  • Disadvantages:
    • It uses the complete dataset to calculate the gradients at every steps
    • Slow when training data is large
    • Difficult to find the learning rate
    • Difficult to ascertain the number of epochs(iterations)

Stochastic Gradient Descent (SGD)

Due to the random nature, the algorithm is much less regular than BGD.

  1. Process a random input sample and find the cost.
  2. Update ww and bb, and repeat the steps for "n" iterations on the training samples.
  • Advantages:
    • Computes gradient based on single input sample, which is memory efficient.
    • Much faster compared to BGD.
    • Possible to train on large datasets.
    • Randomness is helpful to escape local minima.
  • Disadvantages:
    • Might not reach the optimal value, but very close to it.
      • Simulated annealing: Reduce the learning rate gradually
      • Create a Learning Schedule to determine the learning rate at each iteration.

Mini-batch Gradient Descent (MBGD)

  1. Divide the tranining set into mini-batches of size nn (e.g., 64, 128, 256).
  2. Process all the samples in a mini-batch and find the average cost
  3. Update ww and bb, and repeat the steps for "n" iterations/epoches on the traning samples.
  • Advantages:
    • Computes gradient based on small sets of input smaple
    • Much faster compared to BGD.
    • Possible to train on large dataset.
    • Performance boost on matrix operations using GPUs.
    • Might not reach the optional value but, very close to it and possibly better than SGD.
  • Disadvantages:
    • It may be harder to escape the local minima compared to SGD.

GD

Exponentially Weighted Averages

  • One of the popular algorithm for smoothing sequential data (time series data), aka. moving average.
  • Weight the number of observations and using their average
V0=0V1=0.9V0+0.1θ1V2=0.9V1+0.1θ2V3=0.9V2+0.1θ3Vt=0.9Vt1+0.1θtVt=βVt1+(1β)θtV_0 = 0 \\ V_1 = 0.9 \cdot V_0 + 0.1 \cdot \theta_1 \\ V_2 = 0.9 \cdot V_1 + 0.1 \cdot \theta_2 \\ V_3 = 0.9 \cdot V_2 + 0.1 \cdot \theta_3 \\ \vdots \\ V_t = 0.9 \cdot V_{t-1} + 0.1 \cdot \theta_t \\ V_t = \beta \cdot V_{t-1} + (1 - \beta) \cdot \theta_t

VtV_t is approximate average over 11β\approx \frac{1}{1 - \beta} time steps.

  • For β=0.9\beta = 0.9, VtV_t is average over the last 10 time steps.
  • For β=0.98\beta = 0.98, VtV_t is average over the last 50 time steps.
  • For β=0.5\beta = 0.5, VtV_t is average over the last 2 time steps.

Optimizers

SGD with Moementum

At iteration tt:

  • Calculate dwdw and dbdb on the current mini-batch (Hyper parameters: α\alpha and β\beta)
  • Update the velocity:
    • Vdw=βVdw+(1β)dwVt=βVt1+(1β)θtV_{dw} = \beta V_{dw} + (1 - \beta) dw \rightarrow V_t = \beta V_{t-1} + (1 - \beta) \theta_t
    • Vdb=βVdb+(1β)dbV_{db} = \beta V_{db} + (1 - \beta) db
  • Update parameters:
    • w:=wαVdww := w - \alpha V_{dw}
    • b:=bαVdbb := b - \alpha V_{db}

RMSProp

  • Root Mean Square Propagation.
  • Unpublished adaptive learning method by Geoffrey Hinton.
  • Reduces oscillation but in a different way than Momentum.
  • Divides the learning rate by an exponentially decaying average of squared gradients.
  • Calculate dwdw and dbdb on the current mini-batch
    • Sdw=βSdw+(1β)dw2S_{dw} = \beta S_{dw} + (1 - \beta) dw^2
    • Sdb=βSdb+(1β)db2S_{db} = \beta S_{db} + (1 - \beta) db^2
  • Update parameters:
    • w:=wαdwSdw+ϵw := w - \alpha \frac{dw}{\sqrt{S_{dw}} + \epsilon}
    • b:=bαdbSdb+ϵb := b - \alpha \frac{db}{\sqrt{S_{db}} + \epsilon}
    • ϵ\epsilon is a small number to prevent division by zero (e.g., 108 to 101010^{-8} \text{ to } 10^{-10})

Adam

  • Adaptive Moment Estimation
  • Combination of RMSProp and Momentum
  • Work well for a wide range of non-convex optimization problems in machine learning.
  • Calculate dwdw and dbdb on the current mini-batch
    • Vdw=β1Vdw+(1β1)dwMomentum,β1V_{dw} = \beta_1 V_{dw} + (1 - \beta_1) dw \leftarrow Momentum, \beta_1
    • Vdb=β1Vdb+(1β1)dbV_{db} = \beta_1 V_{db} + (1 - \beta_1) db
    • Sdw=β2Sdw+(1β2)dw2RMSProp,β2S_{dw} = \beta_2 S_{dw} + (1 - \beta_2) dw^2 \leftarrow RMSProp, \beta_2
    • Sdb=β2Sdb+(1β2)db2S_{db} = \beta_2 S_{db} + (1 - \beta_2) db^2
  • Update parameters:
    • w:=wαVdwSdw+ϵw := w - \alpha \frac{V_{dw}}{\sqrt{S_{dw}} + \epsilon}
    • b:=bαVdbSdb+ϵb := b - \alpha \frac{V_{db}}{\sqrt{S_{db}} + \epsilon}
    • ϵ\epsilon is a small number to prevent division by zero (e.g., 108 to 101010^{-8} \text{ to } 10^{-10})
  • Hyper parameter guide:
    • α=0.001\alpha = 0.001
    • β1=0.9\beta_1 = 0.9: Momentum term
    • β2=0.999\beta_2 = 0.999: Moving weighted average
    • ϵ=108\epsilon = 10^{-8}: To prevent division by zero
  • ensmallen.org

Learning Rate Decay

  • Speed-up the learning algorighm by slowing decreasing the learning rate α\alpha as the number of epochs increases.

Activation Functions

  • Getting the output of a layer in a neural network and applying a non-linear function to it.
    • Sigmoid: σ(x)=11+ex\sigma(x) = \frac{1}{1 + e^{-x}}
    • Tanh: tanh(x)=21+e2x1\tanh(x) = \frac{2}{1 + e^{-2x}} - 1
    • Used for binary classification in the output layer.
  • ReLU: A(x)=max(0,x)A(x) = \max(0, x)
    • Rectified Linear Unit
    • Avoids and rectifies vanishing gradient problem
    • Best used in hidden layers
    • Computationally less expensive than sigmoid and tanh
  • Softmax: S(xi)=exijexjS(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}}
    • Turns numbers in probabilities that sum to 1.
    • Used for multi-class classification in the output layer.

IQC 005

· 10 min read

Encoding

  • Basis encoding: binary srings -> computational basis states
  • Amplitude encoding: data values -> amplitudes of a quantum state
  • Angle encoding: data values -> rotation angles on individual qubits

Basis Encoding

x=b1b2...bnx=b1b2...bnx = b_1 b_2 ... b_n \rightarrow \ket{x} = \ket{b_1 b_2 ... b_n}

  • applying XX gates to flip qubits from 0\ket{0} to 1\ket{1} based on the binary representation of the data.
  • 0001\ket{00} \rightarrow \ket{01} (X gate on the second qubit)
  • 0010\ket{00} \rightarrow \ket{10} (X gate on the first qubit)
  • Dataset superposition: to encode set {01,11}\{01, 11\}
    • S=12(01+11)\ket{S} = \frac{1}{\sqrt{2}}(\ket{01} + \ket{11})
    • it requires Hadamard and/or controlled gates
  • Hadamard transform
    • Hn0n=12nx=02n1x\ket{H^{\otimes n}}\ket{0}^{\otimes n} = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} \ket{x}
import pennylane as qml

x = [int(b) for b in '1011']

def circuit(x):
qml.BasisEmbedding(features=x, wires=range(len(x)))

dev = qml.device('default.qubit', wires=4)
qnode = qml.QNode(circuit, dev)

print(qml.draw(qml.transforms.decompose(qnode), show_all_wires=True)(x))

Amplitude Encoding

x=[x0,x1,...,xN1]ψ=i=02N1xkkx = [x_0, x_1, ..., x_{N-1}] \rightarrow |\psi\rangle = \sum_{i=0}^{2^N-1} x_k |k\rangle

  • The most qubit-efficient encoding method, since nn qubits can encode 2n2^n amplitudes.
00001001201030114100510161107111\begin{align*} \ket{\bf 0} &\equiv \ket{000} \\ \ket{\bf 1} &\equiv \ket{001} \\ \ket{\bf 2} &\equiv \ket{010} \\ \ket{\bf 3} &\equiv \ket{011} \\ \ket{\bf 4} &\equiv \ket{100} \\ \ket{\bf 5} &\equiv \ket{101} \\ \ket{\bf 6} &\equiv \ket{110} \\ \ket{\bf 7} &\equiv \ket{111} \end{align*}
  • The state can be rewritten more explicitly (for 3 qubits)

ψ=x0000+x1001+x2010+x3011+x4100+x5101+x6110+x7111\ket{\psi} = x_0 \ket{000} + x_1 \ket{001} + x_2 \ket{010} + x_3 \ket{011} + x_4 \ket{100} + x_5 \ket{101} + x_6 \ket{110} + x_7 \ket{111}

  • For NN data points, we need n=log2(N)n = \log_2(N) qubits.
  • Converting to binary
    • with nn qubits, the computational basis states range from
      • 0\ket{0} to 2n1\ket{2^n - 1}
    • [x0,x1,...,xN1][x_0, x_1, ..., x_{N-1}] can be mapped to x0000+x1001++xN1(N1)x_0\ket{000} + x_1\ket{001} + \cdots + x_{N-1}\ket{(N-1)}.
    • bi=k2imod2b_i = \left\lfloor \frac{k}{2^i} \right\rfloor \bmod 2.
  • if k=5k = 5
    • b0=520mod2=1b_0 = \left\lfloor \frac{5}{2^0} \right\rfloor \bmod 2 = 1
    • b1=521mod2=0b_1 = \left\lfloor \frac{5}{2^1} \right\rfloor \bmod 2 = 0
    • b2=522mod2=1b_2 = \left\lfloor \frac{5}{2^2} \right\rfloor \bmod 2 = 1
    • gives b2b1b0=101b_2 b_1 b_0 = 101 (binary representation of 5)
    • 5=101\ket{\bf 5} = \ket{101}

Angle Encoding

  • Each classical value controls the roation angle of a qubit gate.
  • x=[x1,x2,...,xn] x = [x_1, x_2, ..., x_n]
  • RY(x0)0RY(x1)0...RY(xn1)0RY(x_0) \ket{0} \otimes RY(x_1) \ket{0} \otimes ... \otimes RY(x_{n-1}) \ket{0}
  • where RY(θ)=[cos(θ/2)sin(θ/2)sin(θ/2)cos(θ/2)]RY(\theta) = \begin{bmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{bmatrix} is the YY rotation gate.

Summary of Encoding Methods

MethodQubit costEample Classical DataUse CaseCircuit Complexity
Basis Encoding1 qubit per bitBinary strings ('1011')Binary data, configsEasy
Amplitude Encodinglog2(N)\log_2(N) qubitsVector of real numbersQML, optimizationHard
Angle Encoding1 qubit per data pointVector of anglesVariational / hybrid algorithmsEasy
  • Pennylane: BasisEmbedding, AmplitudeEmbedding, AngleEmbedding
  • Qiskit: initialize
  • PyTKET: StatePreparationBox

Amplitude Encoding Circuit

  • The algorithm to prepare an arbitrary vector of length 2n2^n takes roughly that many gates. (e.g. 23102^3 \approx 10)
wires = [0, 1, 2]

def circuit(x):
qml.AmplitudeEmbedding(x, wires)

dev = qml.device("default.qubit", wires=wires)
qnode = qml.QNode(circuit, dev)

# Random vector of length 8 (for 3 qubits)
x = np.random.rand(8)
# Normalize the vector to have unit length (required for quantum states)
x = x/np.sqrt(np.sum(x**2))

qml.draw_mpl(qnode)(x);
qml.draw_mpl(qml.transforms.decompose(qnode), decimals=3)(x);

Amplitute encoding circuit

from pytket.circuit import StatePreparationBox
from pytket.circuit.display import render_circuit_jupyter as draw

state_circ = pytket.circuit.Circuit(3)

# Example 3-qubit state to prepare
w_state = 1 / np.sqrt(3) * np.array([0, 1, 1, 0, 1, 0, 0, 0])

w_state_box = StatePreparationBox(w_state)
state_circ.add_gate(w_state_box, [0, 1, 2])

draw(state_circ)

pytket.transform.Transform.DecomposeBoxes().apply(state_circ)
draw(state_circ)

Amplitute encoding circuit with PyTKET

Different algorithms for Rotation gate

  • Pennylane: Transformation of quantum states using uniformly controlled rotations Ry(θ)=eθY/2=[cos(θ/2)sin(θ/2)sin(θ/2)cos(θ/2)]Ry(\theta) = e^{-\theta Y / 2} = \begin{bmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{bmatrix}

  • PyTKET: Synthesis of Quantum Logic Circuits Ry(θ)=eθπY/2=[cos(θπ/2)sin(θπ/2)sin(θπ/2)cos(θπ/2)]R_y(\theta) = e^{-\theta \pi Y / 2} = \begin{bmatrix} \cos(\theta \pi / 2) & -\sin(\theta \pi / 2) \\ \sin(\theta \pi / 2) & \cos(\theta \pi / 2) \end{bmatrix}

  • Qiskit: Quantum Circuits for Isometries

  • There are concreate algorithms for encoding classical data into quantum states, but this process is generally computationally expensive.

  • It usually requires classical preprocessing, and the resulting circuit can be quite deep.

  • In general, amplitude encoding an arbitary large vector is expensive.

  • "a quantum computer can store an exponential amount of data" should be always considered together with the cost of state preparation.

  • The real advantage of quantum computers does not appear for every problem, but only for specific problems that are well-suited to them.

    • Problems where state preparation is natural or efficient.
    • Problems that involve simulating quantum systems themselves.
    • Structured linear algebra, optimization, or sampling problems.
    • Problems where the input is already given as a quantum state.

Binary Logic Gates

absumcarry
0000
0110
1010
1101
  • Sum s=abs = a \oplus b (XOR)
  • Carry c=abc = a \cdot b (AND)
  1. Declare registers
  2. Apply opertions
  3. Read the result

Full-Adder in Binary Logic

  • Handles three inpus (aa, bb, and carry-in cinc_{in}) and produces two outputs (sum ss and carry-out coutc_{out}).
aabbcinc_{\text{in}}sscoutc_{\text{out}}
00000
00110
01010
01101
10010
10101
11001
11111
  • s=abcins = a \oplus b \oplus c_{in}
  • cout=(ab)(acin)(bcin)c_{out} = (a \cdot b) \oplus (a \cdot c_{in}) \oplus (b \cdot c_{in})
  • A full-adder = two half-adders + an OR gate
  • Chaning full-adders producs a ripple-carry adder for multi-bit numbers.

Quantum Arithmetic

  • All quantum gates must be Unitary (Invertable)
  • A classical half-adder discards input information after computing the sum and carry.
    • This irreversibility is forbidden in quantum circuits.
  • The computation must be done in a way that preserves all input information.
    • XOR and AND operations must be implemented using reversible gates (e.g. Toffoli gate).
    • XOR \rightarrow CNOT gate
    • AND \rightarrow Toffoli gate
Classical opQuantum gate
XOR (ab)(a \oplus b)CNOT(a,b)=aba\text{CNOT}(a, b) = \ket{a} \otimes \ket{b \oplus a}
AND (ab)(a \cdot b)CCXabc=ab(ab)c\text{CCX}\ket{a}\ket{b}\ket{c} = \ket{a} \ket{b} \ket{(a \cdot b) \oplus c}
  • With a third qubit initialized to 0\ket{0}, we can compute the AND of aa and bb without losing information about aa and bb.

Half-Adder

Half Adder with two work qubits

Half Adder with one work qubit

Full-Adder

Full Adder

Cout=(ab)(acin)(bcin)C_{out} = (a \cdot b) \oplus (a \cdot c_{in}) \oplus (b \cdot c_{in}) s=abcins = a \oplus b \oplus c_{in}

Reduced Full Adder

CCX(a,b,cout)CNOT(a,b)CCX(b,cin,cout)CNOT(b,cin)CCX(a, b, c_{out}) \rightarrow CNOT(a, b) \rightarrow CCX(b, c_{in}, c_{out}) \rightarrow CNOT(b, c_{in})

Ripple Carry Adder

  • Two full-adder circuits in sequence, overlapping on the carry qubit ("rippling" the carry through the circuit).
  • CDKM adder
    • Carry-out is the majority vote of the three inputs

MAJ circuit

  • MAJ: Majority, computes carry in-place using only 2 CNOTs + 1 Toffoli

    • (c,b,a)(ca,ba,abacbc)(c, b, a) \rightarrow (c \oplus a, b \oplus a, a \cdot b \oplus a \cdot c \oplus b \cdot c)
  • UMA: UnMajority-and-Add, reverses MAJ but overwrites bb with the sum.

    • ca,ba,coutc,abc,sc \oplus a, b \oplus a, c_{out} \rightarrow c, a\oplus b\oplus c, s

UMA circuit

  • CDKM Ripple-Carry Adder: For two n-bit numbers, chain MAJ/UMA pairs, sharing the carry qubit.
    • CDKMRippleCarryAdder

CDKM circuit

Quantum Multiplication

-------
110
x101
110
000
+110
11100
  • Binary multiplication is "shift and add"
    • for each 1-bit in the multiplier, add a shifted version of the multiplicand to the result.
    • for each 0-bit, add nothing (or add a zero vector).
  • In quantum, each partial addition is a conditional adder , every internal gate is controlled by a bit of the multiplier.

CU=00I+11UCU = \ket{0}\bra{0} \otimes I + \ket{1}\bra{1} \otimes U

Superposition

  • Same adder circuit works on superposition of inputs

ADDER2(1+3)0=2(13)(35)\text{ADDER} \ket{2} (\ket{1} + \ket{3}) \ket{0} = \ket{2} (\ket{1} \ket{3})(\ket{3} \ket{5})

  • With one adder, both 2+12 + 1 and 2+32 + 3 are computed simultaneously.
  • measurement collapses the superposition, we can only ever see one result per run. Either 2+1=32 + 1 = 3 or 2+3=52 + 3 = 5.
  • Amplifying the probability of amplitudes corresponding to correct answers is a key part of quantum algorithms.

Summary

  • Basis encoding maps each classical bit to a qubit, with 0 mapped to 0|0\rangle and 1 mapped to 1|1\rangle.
  • To encode 1011, we need X gates on qubits 0, 2, and 3 (assuming qubit 0 is the most significant bit).
  • Applying an H gate (Hadamard gate) to each qubit puts the system in a uniform superposition over all basis states.
  • Amplitude encoding of an arbitrary vector of size 2n2^n generally requires O(2n)O(2^n) gates.
  • In angle encoding, each classical value xix_i becomes the rotation angle of an RY gate on the ii-th qubit.
  • Using the binary conversion formula, the decimal number 5 maps to 101|101\rangle for 3 qubits.
  • In short, in quantum addition, the sum is an XOR operation, and the carry is an AND operation.
  • A quantum half-adder must keep the original inputs to remain reversible and maintain its unitary nature.
  • The Toffoli gate is defined as: abcabcab|a\rangle|b\rangle|c\rangle \mapsto |a\rangle|b\rangle|c \oplus a \cdot b\rangle.
  • A full-adder should produce a carry-out as follows: cout=abacinbcinc_{\text{out}} = a \cdot b \oplus a \cdot c_{\text{in}} \oplus b \cdot c_{\text{in}}.
  • The following operation is reversible: abcaabcab|a\rangle|b\rangle|c\rangle \mapsto |a\rangle|a \oplus b\rangle|c \oplus a \cdot b\rangle.
  • For addition "in superposition," measuring the output register yields exactly one of the partial sums, chosen probabilistically.
  • The MAJ circuit computes the majority function as: abacbca \cdot b \oplus a \cdot c \oplus b \cdot c.
  • The QASM snippet for a half-adder uses ccx q[0], q[1], q[2] to compute the carry (aba \cdot b) and a cx q[0], q[1] (or cx q[1], q[0]) gate to compute the sum (aba \oplus b).
  • Quantum multiplication can be implemented via conditional addition, one per multiplier bit.

IQC 004

· 10 min read

Superdense Coding

  • counterintuitive protocol that allows us to send 2 bits of classical information by sending only 1 qubit, using pre-shared entanglement
  • it's often contextualized as a game with Alice and Bob.
  • Alice and Bob share and entangled pair of qubits
  • Pre-shared Entanglement (Shared Bell pair)
    • Φ+=12(00+11)|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)
  • Alice wants to send 2 bits of classical information (00, 01, 10, or 11) to Bob
    • For 00: Apply Identity II (do nothing)
    • For 01: Apply Pauli-X XX (bit flip)
    • For 10: Apply Pauli-Z ZZ (phase flip)
    • For 11: Apply XZXZ (bit and phase flip)
  • Alice sends her one qubit to Bob (Bob possesses both qubits of the entangled pair)
  • Bob decodes the information
    • Inverts the entanglement operation on the two qubits (CNOT followed by Hadamard)
    • Measures each of the qubits
    • The outcome of this measurement reveals the two bits Alice encoded.
  • Φ+|\Phi^+\rangle: is the most common and convenient form (Bell State) of entanglement and is also maximally entangled state despite single-qubit unitary gates.
    • Easy to build circuits for it
    • Symmetric and has nice properties that make it ideal for protocols like superdense coding and teleportation
    • Other Bell states can be generated from Φ+|\Phi^+\rangle by applying single-qubit gates.

Superdense Coding Circuit

Superdense Coding Circuit

  • Alice's message mnmn, where each of mm and nn is a bit (0 or 1)
  • Alice's operation can be represented as ZmXnIΦ+Z^m X^n \otimes \mathbb{I} | \Phi^+\rangle
    • m=0Z0=Im=0 \Rightarrow Z^0 = I (no phase flip)
    • m=1Z1=Zm=1 \Rightarrow Z^1 = Z (phase flip)
    • n=0X0=In=0 \Rightarrow X^0 = I (no bit flip
    • n=1X1=Xn=1 \Rightarrow X^1 = X (bit flip)
    • 00I00 \rightarrow I
    • 01X01 \rightarrow X
    • 10Z10 \rightarrow Z
    • 11ZX11 \rightarrow ZX
  • Zmb=(1)mbbZ^m|b\rangle = (-1)^{mb}|b\rangle (phase flip)
  • Xna=anX^n|a\rangle = |a \oplus n\rangle (xor operation)

Xn(12(00+11))=12(n0+(1n)1)X^n \left(\frac{1}{\sqrt{2}} (|00\rangle + |11\rangle)\right) = \frac{1}{\sqrt{2}} \left( |n0\rangle + |(1\oplus n)1\rangle \right)

  • n=0n=0: 12(00+11)=Φ+\frac{1}{\sqrt{2}} (|00\rangle + |11\rangle) = |\Phi^+\rangle
  • n=1n=1: 12(10+01)\frac{1}{\sqrt{2}} (|10\rangle + |01\rangle)

ZmXn(12(00+11))=12((1)mnn0+(1)m(1n)(1n)1)Z^m X^n \left(\frac{1}{\sqrt{2}} (|00\rangle + |11\rangle)\right) = \frac{1}{\sqrt{2}} \left( (-1)^{mn}|n0\rangle + (-1)^{m(1\oplus n)}|(1\oplus n)1\rangle \right)

  • n0|n0\rangle: first qubit is nn, (1)mn(-1)^{mn} is the phase factor
  • (1n)1|(1\oplus n)1\rangle: first qubit is 1n1\oplus n, (1)m(1n)(-1)^{m(1\oplus n)} is the phase factor
  • Bob's decoding operation is the inverse of the entanglement operation
    • CNOTab=aabCNOT |a\rangle |b\rangle = |a\rangle |a \oplus b\rangle
CNOTZmXn(12(00+11))=(1)mn2(nn+(1)m(1n)n)=(1)mn2(n+(1)m1n)n\begin{aligned} CNOT\, Z^m X^n \left(\frac{1}{\sqrt{2}} (|00\rangle + |11\rangle)\right) &= \frac{(-1)^{mn}}{\sqrt{2}} \left(|nn\rangle + (-1)^m|(1 \oplus n)n\rangle\right) \\ &= \frac{(-1)^{mn}}{\sqrt{2}} \left(|n\rangle + (-1)^m|1 \oplus n\rangle\right)|n\rangle \end{aligned}
  • which is seperable state.
  • First qubit is n+(1)m1n|n\rangle + (-1)^m|1 \oplus n\rangle and second qubit is n|n\rangle.
  • n,m{0,1}n, m \in \{0, 1\}, then apply Hadamard to the first qubit:
H12(n+(1)m1n)=(1)mnmH\, \frac{1}{\sqrt{2}} \left(|n\rangle + (-1)^m|1 \oplus n\rangle\right) = (-1)^{mn}|m\rangle
  • second qubit is n|n\rangle, so the final state is:

(1)mn(1)nmmn=mn(-1)^{mn}(-1)^{nm}|m\rangle|n\rangle = |mn\rangle

Superdense QASM

OPENQASM 2.0;
qreg q[2];
creg c[2];

// Prepare Bell pair
h q[0];
cx q[0],q[1];

// Alice's encoding
// For message 11: apply X and Z
x q[0];
z q[0];

// Alice sends q[0] to Bob (in simulation, we just proceed)

// Bob's decoding
cx q[0],q[1];
h q[0];

// Measure both qubits
measure q[0] -> c[0];
measure q[1] -> c[1];

QASM Programming

Custom Gates

// params: parameters for the gate (e.g., rotation angles)
// q_args: quantum arguments (qubits the gate acts on)
gate NAME(parameters) q_args {
// Define gate operations
}

gate bell a,b {
h a;
cx a,b;
}

qreg q[2];
creg c[2];

// Use the custom bell gate
bell q[0], q[1];

// Measure the qubits
measure q[0] -> c[0];
measure q[1] -> c[1];

Toffoli Gate

// Toffoli gate (CCX)
gate ccx a, b, c {
h c;
cx b, c;
tdg c;
cx a, c;
t c;
cx b, c;
tdg c;
cx a, c;
t b;
t c;
cx a, b;
h c;
t a;
tdg b;
cx a, b;
}

Rotation Gates

// Rotation around Y-axis
gate ry_deg(theta) q {
ry(theta/180 * pi) q;
}

// Rotation around X-axis
gate rx_deg(theta) q {
rx(theta/180 * pi) q;
}

Qiskit

IBM

import qiskit

superdense = qiskit.QuantumCircuit(2, 2)
superdense.draw()

superdense.h(0)
superdense.cx(0, 1)
superdense.draw()

superdense qiskit

# Alice's encoding for message '11'
superdense.x(0)
superdense.z(0)
superdense.draw()

superdense qiskit 11

# Bob unentangles the two qubits (reverses the entangling gate)
superdense.cx(0,1)
superdense.h(0)

# The measurement pattern is `measure(qubit to measure, classical bit to store result)`
superdense.measure(0,0)
superdense.measure(1,1)
superdense.draw()

superdense qiskit measure

from qiskit.providers.basic_provider import BasicSimulator

sim = BasicSimulator()
# run the circuit on the simulator with 1 shot (execute the circuit once)
result = sim.run(superdense, shots=1).result().get_counts()

print(result)
# {'11': 1}

Custom Gates in Qiskit

bell = qiskit.QuantumCircuit(2, name='bell')
bell.h(0)
bell.cx(0, 1)
bell_gate = bell.to_gate()

c = qiskit.QuantumCircuit(2)
c.append(bell_gate, [0, 1])
c.draw()

Decompose a custom gate

ccxgate = qiskit.circuit.library.CCXGate()
ccx = qiskit.QuantumCircuit(3)
ccx.append(ccxgate, [0, 1, 2])

print(ccx)
print(ccx.decompose())

ccx

Cirq

Google

import cirq

alice = cirq.NamedQubit('Alice')
bob = cirq.NamedQubit('Bob')

superdense = cirq.Circuit()
superdense.append([
cirq.H(alice),
cirq.CNOT(alice, bob),
])

superdense.append([
cirq.X(alice),
cirq.Z(alice),
])

superdense.append([
cirq.CNOT(alice, bob),
cirq.H(alice),
])

superdense.append(cirq.measure(alice, bob, key='received'))

print(superdense)
simulator = cirq.Simulator()
result = simulator.run(superdense, repetitions=1)
print(result)
# received=1, 1

Pennylane

Xanadu

import pennylane as qml

def entangle():
qml.Hadamard(wires=0)
qml.CNOT(wires=[0, 1])

print(qml.draw(entangle)())

device = qml.device("default.qubit", wires=[0, 1])

# Wrap the quantum function as a QNode
entangle_qnode = qml.QNode(my_circuit, device)

# Set the number of shots to 10
entangle_qnode = qml.set_shots(entangle_qnode, shots=10)
import numpy as np

state = np.array([1, 1j], dtype=complex)

state = state / np.linalg.norm(state)
def teleport(state):
# Ensure "state" is loaded into the first qubit
# (otherwise qubits would start in |0>)
qml.StatePrep(state, wires=0)

# Shared entanglement between qubits 1 and 2
qml.Hadamard(wires=1)
qml.CNOT(wires=[1, 2])

# Alice's operation:
# CNOT from Alice's input qubit to her half of the Bell pair,
# then Hadamard on the input qubit
qml.CNOT(wires=[0, 1])
qml.Hadamard(wires=0)

# Measurement (store the classical outcomes)
m0 = qml.measure(0)
m1 = qml.measure(1)

# Bob's conditional correction operations
qml.cond(m1, qml.PauliX)(wires=2)
qml.cond(m0, qml.PauliZ)(wires=2)

# Return Bob's qubit as a density matrix
return qml.density_matrix(wires=2)

print(qml.draw(teleport)(state))

Quantum Teleportation

Quantum Teleportation

-Superdense CodingTeleportation
ConsumesEntanglementEntanglement
Sends1 qubit2 bits
Transmits2 bits1 qubit
  • if there is entanglement:
    • we can use it to send 2 bits of classical information by sending 1 qubit (superdense coding)
    • we can use it to send 1 qubit of quantum information by sending 2 bits of classical information (teleportation)
  • Alice wants to send a qubit ψ=α0+β1|\psi\rangle = \alpha|0\rangle + \beta|1\rangle to Bob, but only has a classical channel.

Protocol

  1. Alice and Bob share Φ+|\Phi^+\rangle (pre-shared entanglement)
  2. Alice applies CNOTCNOT (her qubit -> her Bell half), then HH, then measures both, obtaining 2 classical bits (00, 01, 10, or 11)
  3. Alice sends mnmn classically to Bob
  4. Bob applies XnZmX^n Z^m to his qubit, recovering ψ|\psi\rangle
  • For 00: Apply Identity II (do nothing)
  • For 01: Apply Pauli-X XX (bit flip)
  • For 10: Apply Pauli-Z ZZ (phase flip)
  • For 11: Apply XZXZ (bit and phase flip)
  • Bob now has a qubit in the state Alice wanted to send ψ|\psi\rangle without Alice ever sending a qubit directly to Bob.

Teleportation Circuit

Teleportation Circuit

ψΦ+=(α0+β1)12(00+11)|\psi\rangle \otimes |\Phi^+\rangle = \left(\alpha |0\rangle + \beta |1\rangle\right) \otimes \frac{1}{\sqrt{2}} \left(|00\rangle + |11\rangle\right)

ψΦ+=12(α000+α011+β100+β111)|\psi\rangle \otimes |\Phi^+\rangle = \frac{1}{\sqrt{2}} \left(\alpha |0\rangle |00\rangle + \alpha |0\rangle |11\rangle + \beta |1\rangle |00\rangle + \beta |1\rangle |11\rangle\right)

  • Alice applies a CNOTCNOT gate with her qubit ψ|\psi\rangle as control and her half of the Bell pair Φ+|\Phi^+\rangle as target.
    • β100\beta |1\rangle |00\rangle becomes β110\beta |1\rangle |10\rangle (target flips when control is 1)
    • β111\beta |1\rangle |11\rangle becomes β101\beta |1\rangle |01\rangle (target flips when control is 1)

12(α000+α011+β110+β101)\frac{1}{\sqrt{2}} \Big(\alpha |0\rangle |00\rangle + \alpha |0\rangle |11\rangle + \beta |1\rangle |10\rangle + \beta |1\rangle |01\rangle\Big)

  • Alice then applies a Hadamard gate to her qubit (ψ|\psi\rangle).
    • H0=0+12H|0\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}
    • H1=012H|1\rangle = \frac{|0\rangle - |1\rangle}{\sqrt{2}}
  • To prepare for measurement, Bell measurement is performed on Alice's two qubits, which can be expressed as:
12[α(0+12)00+α(0+12)11+β(012)10+β(012)01]=12[00(α0+β1)+01(α1+β0)+10(α0β1)+11(α1β0)]\begin{aligned} \frac{1}{\sqrt{2}} \Bigg[&\alpha \left(\frac{|0\rangle + |1\rangle}{\sqrt{2}}\right) \otimes |00\rangle + \alpha \left(\frac{|0\rangle + |1\rangle}{\sqrt{2}}\right) \otimes |11\rangle \\ &+ \beta \left(\frac{|0\rangle - |1\rangle}{\sqrt{2}}\right) \otimes |10\rangle + \beta \left(\frac{|0\rangle - |1\rangle}{\sqrt{2}}\right) \otimes |01\rangle \Bigg] \\ =\, &\frac{1}{2} \Big[ |00\rangle (\alpha |0\rangle + \beta |1\rangle) + |01\rangle (\alpha |1\rangle + \beta |0\rangle) \\ &+ |10\rangle (\alpha |0\rangle - \beta |1\rangle) + |11\rangle (\alpha |1\rangle - \beta |0\rangle) \Big] \end{aligned}
  • Alice measures her two qubits, resulting in one of four possible outcomes corresponding to the classical bits mn|mn\rangle where m,n{0,1}m, n \in \{0, 1\}
    • Alice's two qubits: 00,01,10,11|00\rangle, |0 1\rangle, |1 0\rangle, |1 1\rangle
    • Bob's qubit is in a state that depends on Alice's measurement outcome

XnZmψX^n Z^m |\psi\rangle

mnBob's stateBob applies
00α0+β1\alpha \lvert 0\rangle + \beta \lvert 1\rangleI\mathbb{I}
01α1+β0\alpha \lvert 1\rangle + \beta \lvert 0\rangleXX
10α0β1\alpha \lvert 0\rangle - \beta \lvert 1\rangleZZ
11α1β0\alpha \lvert 1\rangle - \beta \lvert 0\rangleXZXZ

Channels

  • Classical Channel: carries bits (fibre, radio, paper, ...)
  • Quantum Channel: carries qubits (can also carry bits)
    • Quantum channels can emulate classical ones, but not vice versa
  • Teleportation: classical channel + pre-shared entanglement -> effective quantum channel

Summary

  • In superdense coding, by sending only one qubit and using pre-shared entanglement, Alice can transmit two classical bits of information.
  • The operation Xna=a(nmod2)X^n \lvert a \rangle = \lvert a \oplus (n \bmod 2) \rangle correctly defines the effect of applying the XX gate nn times.
  • The tensor product of two identity operators is the identity operator on the composite space. In symbols, where the subscript is the dimension: I2I2=I4I_2 \otimes I_2 = I_4.
  • The state ϕ=12(0+1)\lvert \phi \rangle = \frac{1}{\sqrt{2}}(\lvert 0 \rangle + \lvert 1 \rangle) is an eigenstate of the Pauli-XX gate with eigenvalue +1+1.
  • In the teleportation protocol, the classical communication channel is used to transmit two classical bits from Alice to Bob.
  • Three qubits are required for quantum teleportation.
  • After the teleportation protocol completes, Bob has a qubit in the state ψ\lvert \psi \rangle.
  • Of QASM, Qiskit, Cirq, and PennyLane, PennyLane is the only quantum language that spells out the full name of the Hadamard gate for its built-in gates.

TIM 004

· 3 min read

Innovation Ecosystem

  • A network of organizations, people and resources that interact with each other to develop and support new ideas, technologies and businesses.
  • Innovation Ecosystem
    • Start-up companies
    • Medical Centers
    • Mature Companies
    • City, Regional and State Organizations
    • Providers of Support Services (Legal, Accounting, etc)
    • Venture Capital Funds
    • Colleges & universities

Type of Innovation Ecosystem

  • Corporate innovation ecosystem
  • Digital innovation ecosystem
  • City-based innovation ecosystem
  • High-tech SMEs centered ecosystem (Small and Medium-sized Enterprises)
  • University-based ecosystem
  • Incubators and Accelerators ecosystems
  • Regional and National innovation ecosystems
  • Social innovation ecosystems

The core -> New Innovation Initiatives -> Startup Ecosystem -> Customers

Roles and Activites across Innovation Ecosystem

  • Leadership:
    • Ecosystem leader
      • ecosystem governance: decipher roles, coordinate interactions, orchestrate resource flows
      • forging partnerships: attract & link partners, create collaboration, stimulate complementary
      • platform management: build platform, open platform, orchestrate compleentors
      • value management: decipher bases of value, create & capture value
    • Dominator: integrate actors
  • Direct Value Creation:
    • Supplier: supply components
    • Assembler: assemble components
    • Complementor: provide complementarities
    • User: define need, provide ideas, purchase & use
  • Value Support:
    • Expert: generte knowledge, provide expertise, transfer technology
    • Champion: build connections, provide access to markets
  • Entrepreneur Ecosystem:
    • Entrepreneur: co-locate, set-up network
    • Sponsor: give resources, co-develop offering, link to other actors
    • Regulator: provide favorable conditions

Systems Thinking

  • a holistic and non-linear approach to problem solving that focuses on the interactions and patterns within an entire system
  • emphasizes relationships and feedback loops instead of analysing individual parts in isolation

Fishbone Diagram

  • common categories to help think broadly about the possible causes include:
  • People: skils, training, communication, motivation
  • Process / Methods: procedures, or workflows
  • Machines / Techonology: tools, equipment, software
  • Materials: resources or inputs used
  • Environment: workplace or external conditions
  • Mangement / Policies: decisions, rules, or leadership

Innovation Networks

  • connected systems of people working together toward shared objectives, often through internal teams and external partners such as suppliers, universities, accelerators, customers, and startups.
  • is also part of innovation ecosystem

Types of Innovation Networks

  • Enterpreneur-based
  • Internal project teams
  • Internal enterpreneur networks
  • Communities of practice: can involve players inside and across different organizations
  • Spatial clusters: like Silicon Valley, Boston, etc
  • Sectoral networks: bring different players together because they share a common sector
  • New product or process develoment consortium
  • New technology development consortium
  • Emerging standards: Exploring and estabilishing standards around innovative technologies
  • Supply chain learning
  • Learning networks
  • Recombinant innovation networks: Cross-sectoral groupings that alloow for networking across boundaries and the transfer of ieads.
  • Managed open innovation networks
  • User networks
  • Innovation markets
  • Crowdfunding and new resource approaches

Stakeholder Analysis

  • High power, highly interested people (Engage Closely)
  • High power, less interested people (Keep Satisfied)
  • Low power, Highly interested people (Keep Informed)
  • Low power, less interested people (Monitor)