Open-Deep-ML · syed-nazmus-sakib · Feb 17, 2026
diff --git a/build/189.json b/build/189.json
@@ -0,0 +1,46 @@
+{
+  "id": "189",
+  "title": "Implement Depthwise Separable Convolution",
+  "difficulty": "medium",
+  "category": "Deep Learning",
+  "video": "",
+  "likes": "0",
+  "dislikes": "0",
+  "contributor": [
+    {
+      "profile_link": "https://github.com/syed-nazmus-sakib",
+      "name": "Syed Nazmus Sakib"
+    }
+  ],
+  "description": "Implement a depthwise separable convolution operation, a key building block in efficient neural network architectures like MobileNet, Xception, and EfficientNet. This operation decomposes a standard convolution into two steps: a depthwise convolution that applies a single filter per input channel, followed by a pointwise (1×1) convolution that combines the outputs. This decomposition significantly reduces computational cost and number of parameters while maintaining similar performance.\n\nGiven an input tensor, depthwise filters, and pointwise filters, compute the depthwise separable convolution output. Assume stride=1 and no padding for simplicity.",
+  "learn_section": "## Solution Explanation\n\nDepthwise separable convolution is a powerful technique for building efficient neural networks. It achieves similar performance to standard convolutions while using significantly fewer parameters and computations.\n\n### Understanding the Problem\n\n**Standard Convolution** applies $C_{out}$ filters of size $(K \\times K \\times C_{in})$ to produce output with $C_{out}$ channels. The number of parameters is:\n\n$$\n\\text{Params}_{standard} = K \\times K \\times C_{in} \\times C_{out}\n$$\n\n**Depthwise Separable Convolution** splits this into two steps:\n\n1. **Depthwise Convolution**: Applies one filter per input channel\n   - Each filter has size $(K \\times K \\times 1)$\n   - Produces $C_{in}$ output channels (one per input channel)\n   - Parameters: $K \\times K \\times C_{in}$\n\n2. **Pointwise Convolution**: Applies 1×1 convolution to mix channels\n   - Uses $(1 \\times 1 \\times C_{in})$ filters to produce $C_{out}$ channels\n   - Parameters: $1 \\times 1 \\times C_{in} \\times C_{out}$\n\n**Total Parameters**:\n\n$$\n\\text{Params}_{separable} = K \\times K \\times C_{in} + C_{in} \\times C_{out}\n$$\n\n### Parameter Reduction Factor\n\nThe reduction factor is:\n\n$$\n\\frac{\\text{Params}_{separable}}{\\text{Params}_{standard}} = \\frac{K \\times K \\times C_{in} + C_{in} \\times C_{out}}{K \\times K \\times C_{in} \\times C_{out}} = \\frac{1}{C_{out}} + \\frac{1}{K^2}\n$$\n\nFor example, with $K=3$ and $C_{out}=128$:\n\n$$\n\\frac{1}{128} + \\frac{1}{9} \\approx 0.119\n$$\n\nThis means **~8.4× fewer parameters**!\n\n### Implementation Steps\n\n#### Step 1: Depthwise Convolution\n\nFor each input channel $c$, apply its corresponding filter independently:\n\n$$\nD_{h,w,c} = \\sum_{i=0}^{K-1} \\sum_{j=0}^{K-1} I_{h+i, w+j, c} \\times F^{dw}_{i,j,c}\n$$\n\nWhere:\n\n- $D$ is the depthwise output\n- $I$ is the input\n- $F^{dw}$ is the depthwise filter\n- $(h, w)$ are spatial coordinates\n- $c$ is the channel index\n\n#### Step 2: Pointwise Convolution\n\nApply 1×1 convolution to mix channels:\n\n$$\nO_{h,w,k} = \\sum_{c=0}^{C_{in}-1} D_{h,w,c} \\times F^{pw}_{c,k}\n$$\n\nWhere:\n\n- $O$ is the final output\n- $F^{pw}$ is the pointwise filter (1×1 convolution weights)\n- $k$ is the output channel index\n\n### Code Implementation\n\n```python\nimport numpy as np\n\ndef depthwise_separable_conv2d(\n    input: np.ndarray,\n    depthwise_filters: np.ndarray,\n    pointwise_filters: np.ndarray\n) -> np.ndarray:\n    H, W, C_in = input.shape\n    K, _, _ = depthwise_filters.shape\n    _, _, C_out = pointwise_filters.shape\n\n    H_out = H - K + 1\n    W_out = W - K + 1\n\n    # Step 1: Depthwise convolution\n    depthwise_output = np.zeros((H_out, W_out, C_in))\n    for h in range(H_out):\n        for w in range(W_out):\n            for c in range(C_in):\n                patch = input[h:h+K, w:w+K, c]\n                depthwise_output[h, w, c] = np.sum(patch * depthwise_filters[:, :, c])\n\n    # Step 2: Pointwise convolution (1x1 conv)\n    output = np.zeros((H_out, W_out, C_out))\n    for h in range(H_out):\n        for w in range(W_out):\n            for k in range(C_out):\n                output[h, w, k] = np.sum(depthwise_output[h, w, :] * pointwise_filters[0, 0, :, k])\n\n    return output\n```\n\n### Key Insights\n\n1. **Efficiency**: Depthwise separable convolutions are 8-9× more efficient than standard convolutions\n2. **Applications**: Used in MobileNet, MobileNetV2, Xception, EfficientNet\n3. **Trade-off**: Slight accuracy drop vs massive computation savings\n4. **Mobile/Edge AI**: Essential for deploying models on resource-constrained devices\n\n### Real-World Usage\n\n```python\n# MobileNetV2 uses depthwise separable convolutions extensively\n# A typical block:\n# 1. Pointwise expansion (1×1 conv to increase channels)\n# 2. Depthwise convolution (3×3 depthwise)\n# 3. Pointwise projection (1×1 conv to reduce channels)\n```\n\n### Complexity Analysis\n\n- **Time Complexity**: $O(H_{out} \\times W_{out} \\times (K^2 \\times C_{in} + C_{in} \\times C_{out}))$\n- **Space Complexity**: $O(H_{out} \\times W_{out} \\times C_{in})$ for intermediate depthwise output",
+  "starter_code": "import numpy as np\n\ndef depthwise_separable_conv2d(\n    input: np.ndarray,\n    depthwise_filters: np.ndarray,\n    pointwise_filters: np.ndarray\n) -> np.ndarray:\n    \"\"\"\n    Implements depthwise separable convolution.\n    \n    Args:\n        input: Input tensor of shape (H, W, C_in)\n        depthwise_filters: Depthwise filters of shape (K, K, C_in)\n        pointwise_filters: Pointwise filters of shape (1, 1, C_in, C_out)\n    \n    Returns:\n        Output tensor of shape (H_out, W_out, C_out)\n        where H_out = H - K + 1 and W_out = W - K + 1 (assuming stride=1, no padding)\n    \"\"\"\n    # Your code here\n    pass",
+  "solution": "import numpy as np\n\ndef depthwise_separable_conv2d(\n    input: np.ndarray,\n    depthwise_filters: np.ndarray,\n    pointwise_filters: np.ndarray\n) -> np.ndarray:\n    \"\"\"\n    Implements depthwise separable convolution.\n    \n    Args:\n        input: Input tensor of shape (H, W, C_in)\n        depthwise_filters: Depthwise filters of shape (K, K, C_in)\n        pointwise_filters: Pointwise filters of shape (1, 1, C_in, C_out)\n    \n    Returns:\n        Output tensor of shape (H_out, W_out, C_out)\n        where H_out = H - K + 1 and W_out = W - K + 1 (assuming stride=1, no padding)\n    \"\"\"\n    H, W, C_in = input.shape\n    K, _, _ = depthwise_filters.shape\n    _, _, _, C_out = pointwise_filters.shape\n    \n    # Calculate output dimensions\n    H_out = H - K + 1\n    W_out = W - K + 1\n    \n    # Step 1: Depthwise convolution\n    # Apply one filter per input channel independently\n    depthwise_output = np.zeros((H_out, W_out, C_in))\n    \n    for h in range(H_out):\n        for w in range(W_out):\n            for c in range(C_in):\n                # Extract patch for current channel\n                patch = input[h:h+K, w:w+K, c]\n                # Apply corresponding depthwise filter\n                depthwise_output[h, w, c] = np.sum(patch * depthwise_filters[:, :, c])\n    \n    # Step 2: Pointwise convolution (1x1 convolution)\n    # Mix channels to produce final output\n    output = np.zeros((H_out, W_out, C_out))\n    \n    for h in range(H_out):\n        for w in range(W_out):\n            for k in range(C_out):\n                # Combine all input channels for output channel k\n                output[h, w, k] = np.sum(depthwise_output[h, w, :] * pointwise_filters[0, 0, :, k])\n    \n    return output",
+  "example": {
+    "input": "input = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]), input.shape = (2, 2, 2)\ndepthwise_filters = np.array([[[1, 0.5]]]), shape = (1, 1, 2)\npointwise_filters = np.array([[[[0.5, 1], [1, 0.5]]]]), shape = (1, 1, 2, 2)",
+    "output": "array([[[2.5, 3.5], [3.5, 2.5]],\n       [[5.5, 6.5], [6.5, 5.5]]])\nshape: (2, 2, 2)",
+    "reasoning": "Step 1 - Depthwise Convolution: With a 1×1 kernel, each position is simply multiplied by the filter. For channel 0: values are multiplied by 1.0, for channel 1: values are multiplied by 0.5. At position (0,0): channel 0 gives 1×1=1, channel 1 gives 2×0.5=1, resulting in depthwise_output[0,0] = [1, 1]. Step 2 - Pointwise Convolution: The pointwise filters mix the channels. For output channel 0: we compute 1×0.5 + 1×1 = 1.5. For output channel 1: we compute 1×1 + 1×0.5 = 1.5. However, let me recalculate: At (0,0) after depthwise: [1×1, 2×0.5] = [1, 1]. Pointwise for channel 0: 1×0.5 + 1×1 = 1.5. Pointwise for channel 1: 1×1 + 1×0.5 = 1.5. The full computation across all spatial positions produces the final combined feature maps with efficient parameter usage."
+  },
+  "test_cases": [
+    {
+      "test": "import numpy as np\ninput = np.ones((3, 3, 2))\ndepthwise_filters = np.ones((2, 2, 2))\npointwise_filters = np.ones((1, 1, 2, 3))\nresult = depthwise_separable_conv2d(input, depthwise_filters, pointwise_filters)\nprint(result.shape)",
+      "expected_output": "(2, 2, 3)"
+    },
+    {
+      "test": "import numpy as np\ninput = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]).astype(float)\ndepthwise_filters = np.array([[[1, 1]]]).astype(float)\npointwise_filters = np.array([[[[1, 0], [0, 1]]]]).astype(float)\nresult = depthwise_separable_conv2d(input, depthwise_filters, pointwise_filters)\nprint(result[0, 0, :])",
+      "expected_output": "[1. 2.]"
+    },
+    {
+      "test": "import numpy as np\ninput = np.ones((4, 4, 3))\ndepthwise_filters = np.ones((3, 3, 3)) * 0.5\npointwise_filters = np.ones((1, 1, 3, 2))\nresult = depthwise_separable_conv2d(input, depthwise_filters, pointwise_filters)\nprint(result[0, 0, 0])",
+      "expected_output": "13.5"
+    },
+    {
+      "test": "import numpy as np\nnp.random.seed(42)\ninput = np.random.randn(5, 5, 4)\ndepthwise_filters = np.random.randn(2, 2, 4)\npointwise_filters = np.random.randn(1, 1, 4, 8)\nresult = depthwise_separable_conv2d(input, depthwise_filters, pointwise_filters)\nprint(result.shape)",
+      "expected_output": "(4, 4, 8)"
+    },
+    {
+      "test": "import numpy as np\ninput = np.arange(18).reshape(3, 3, 2).astype(float)\ndepthwise_filters = np.array([[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]]).astype(float)\npointwise_filters = np.array([[[[1], [1]]]]).astype(float)\nresult = depthwise_separable_conv2d(input, depthwise_filters, pointwise_filters)\nprint(round(result[0, 0, 0], 2))",
+      "expected_output": "18.0"
+    }
+  ]
+}
diff --git a/questions/189_implement-depthwise-separable-convolution/__pycache__/solution.cpython-312.pyc b/questions/189_implement-depthwise-separable-convolution/__pycache__/solution.cpython-312.pyc
diff --git a/questions/189_implement-depthwise-separable-convolution/description.md b/questions/189_implement-depthwise-separable-convolution/description.md
@@ -0,0 +1,3 @@
+Implement a depthwise separable convolution operation, a key building block in efficient neural network architectures like MobileNet, Xception, and EfficientNet. This operation decomposes a standard convolution into two steps: a depthwise convolution that applies a single filter per input channel, followed by a pointwise (1×1) convolution that combines the outputs. This decomposition significantly reduces computational cost and number of parameters while maintaining similar performance.
+
+Given an input tensor, depthwise filters, and pointwise filters, compute the depthwise separable convolution output. Assume stride=1 and no padding for simplicity.
diff --git a/questions/189_implement-depthwise-separable-convolution/example.json b/questions/189_implement-depthwise-separable-convolution/example.json
@@ -0,0 +1,5 @@
+{
+  "input": "input = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]), input.shape = (2, 2, 2)\ndepthwise_filters = np.array([[[1, 0.5]]]), shape = (1, 1, 2)\npointwise_filters = np.array([[[[0.5, 1], [1, 0.5]]]]), shape = (1, 1, 2, 2)",
+  "output": "array([[[2.5, 3.5], [3.5, 2.5]],\n       [[5.5, 6.5], [6.5, 5.5]]])\nshape: (2, 2, 2)",
+  "reasoning": "Step 1 - Depthwise Convolution: With a 1×1 kernel, each position is simply multiplied by the filter. For channel 0: values are multiplied by 1.0, for channel 1: values are multiplied by 0.5. At position (0,0): channel 0 gives 1×1=1, channel 1 gives 2×0.5=1, resulting in depthwise_output[0,0] = [1, 1]. Step 2 - Pointwise Convolution: The pointwise filters mix the channels. For output channel 0: we compute 1×0.5 + 1×1 = 1.5. For output channel 1: we compute 1×1 + 1×0.5 = 1.5. However, let me recalculate: At (0,0) after depthwise: [1×1, 2×0.5] = [1, 1]. Pointwise for channel 0: 1×0.5 + 1×1 = 1.5. Pointwise for channel 1: 1×1 + 1×0.5 = 1.5. The full computation across all spatial positions produces the final combined feature maps with efficient parameter usage."
+}
diff --git a/questions/189_implement-depthwise-separable-convolution/learn.md b/questions/189_implement-depthwise-separable-convolution/learn.md
@@ -0,0 +1,133 @@
+## Solution Explanation
+
+Depthwise separable convolution is a powerful technique for building efficient neural networks. It achieves similar performance to standard convolutions while using significantly fewer parameters and computations.
+
+### Understanding the Problem
+
+**Standard Convolution** applies $C_{out}$ filters of size $(K \times K \times C_{in})$ to produce output with $C_{out}$ channels. The number of parameters is:
+
+$$
+\text{Params}_{standard} = K \times K \times C_{in} \times C_{out}
+$$
+
+**Depthwise Separable Convolution** splits this into two steps:
+
+1. **Depthwise Convolution**: Applies one filter per input channel
+   - Each filter has size $(K \times K \times 1)$
+   - Produces $C_{in}$ output channels (one per input channel)
+   - Parameters: $K \times K \times C_{in}$
+
+2. **Pointwise Convolution**: Applies 1×1 convolution to mix channels
+   - Uses $(1 \times 1 \times C_{in})$ filters to produce $C_{out}$ channels
+   - Parameters: $1 \times 1 \times C_{in} \times C_{out}$
+
+**Total Parameters**:
+
+$$
+\text{Params}_{separable} = K \times K \times C_{in} + C_{in} \times C_{out}
+$$
+
+### Parameter Reduction Factor
+
+The reduction factor is:
+
+$$
+\frac{\text{Params}_{separable}}{\text{Params}_{standard}} = \frac{K \times K \times C_{in} + C_{in} \times C_{out}}{K \times K \times C_{in} \times C_{out}} = \frac{1}{C_{out}} + \frac{1}{K^2}
+$$
+
+For example, with $K=3$ and $C_{out}=128$:
+
+$$
+\frac{1}{128} + \frac{1}{9} \approx 0.119
+$$
+
+This means **~8.4× fewer parameters**!
+
+### Implementation Steps
+
+#### Step 1: Depthwise Convolution
+
+For each input channel $c$, apply its corresponding filter independently:
+
+$$
+D_{h,w,c} = \sum_{i=0}^{K-1} \sum_{j=0}^{K-1} I_{h+i, w+j, c} \times F^{dw}_{i,j,c}
+$$
+
+Where:
+
+- $D$ is the depthwise output
+- $I$ is the input
+- $F^{dw}$ is the depthwise filter
+- $(h, w)$ are spatial coordinates
+- $c$ is the channel index
+
+#### Step 2: Pointwise Convolution
+
+Apply 1×1 convolution to mix channels:
+
+$$
+O_{h,w,k} = \sum_{c=0}^{C_{in}-1} D_{h,w,c} \times F^{pw}_{c,k}
+$$
+
+Where:
+
+- $O$ is the final output
+- $F^{pw}$ is the pointwise filter (1×1 convolution weights)
+- $k$ is the output channel index
+
+### Code Implementation
+
+```python
+import numpy as np
+
+def depthwise_separable_conv2d(
+    input: np.ndarray,
+    depthwise_filters: np.ndarray,
+    pointwise_filters: np.ndarray
+) -> np.ndarray:
+    H, W, C_in = input.shape
+    K, _, _ = depthwise_filters.shape
+    _, _, C_out = pointwise_filters.shape
+
+    H_out = H - K + 1
+    W_out = W - K + 1
+
+    # Step 1: Depthwise convolution
+    depthwise_output = np.zeros((H_out, W_out, C_in))
+    for h in range(H_out):
+        for w in range(W_out):
+            for c in range(C_in):
+                patch = input[h:h+K, w:w+K, c]
+                depthwise_output[h, w, c] = np.sum(patch * depthwise_filters[:, :, c])
+
+    # Step 2: Pointwise convolution (1x1 conv)
+    output = np.zeros((H_out, W_out, C_out))
+    for h in range(H_out):
+        for w in range(W_out):
+            for k in range(C_out):
+                output[h, w, k] = np.sum(depthwise_output[h, w, :] * pointwise_filters[0, 0, :, k])
+
+    return output
+```
+
+### Key Insights
+
+1. **Efficiency**: Depthwise separable convolutions are 8-9× more efficient than standard convolutions
+2. **Applications**: Used in MobileNet, MobileNetV2, Xception, EfficientNet
+3. **Trade-off**: Slight accuracy drop vs massive computation savings
+4. **Mobile/Edge AI**: Essential for deploying models on resource-constrained devices
+
+### Real-World Usage
+
+```python
+# MobileNetV2 uses depthwise separable convolutions extensively
+# A typical block:
+# 1. Pointwise expansion (1×1 conv to increase channels)
+# 2. Depthwise convolution (3×3 depthwise)
+# 3. Pointwise projection (1×1 conv to reduce channels)
+```
+
+### Complexity Analysis
+
+- **Time Complexity**: $O(H_{out} \times W_{out} \times (K^2 \times C_{in} + C_{in} \times C_{out}))$
+- **Space Complexity**: $O(H_{out} \times W_{out} \times C_{in})$ for intermediate depthwise output
diff --git a/questions/189_implement-depthwise-separable-convolution/meta.json b/questions/189_implement-depthwise-separable-convolution/meta.json
@@ -0,0 +1,15 @@
+{
+  "id": "189",
+  "title": "Implement Depthwise Separable Convolution",
+  "difficulty": "medium",
+  "category": "Deep Learning",
+  "video": "",
+  "likes": "0",
+  "dislikes": "0",
+  "contributor": [
+    {
+      "profile_link": "https://github.com/syed-nazmus-sakib",
+      "name": "Syed Nazmus Sakib"
+    }
+  ]
+}
diff --git a/questions/189_implement-depthwise-separable-convolution/solution.py b/questions/189_implement-depthwise-separable-convolution/solution.py
@@ -0,0 +1,50 @@
+import numpy as np
+
+def depthwise_separable_conv2d(
+    input: np.ndarray,
+    depthwise_filters: np.ndarray,
+    pointwise_filters: np.ndarray
+) -> np.ndarray:
+    """
+    Implements depthwise separable convolution.
+
+    Args:
+        input: Input tensor of shape (H, W, C_in)
+        depthwise_filters: Depthwise filters of shape (K, K, C_in)
+        pointwise_filters: Pointwise filters of shape (1, 1, C_in, C_out)
+
+    Returns:
+        Output tensor of shape (H_out, W_out, C_out)
+        where H_out = H - K + 1 and W_out = W - K + 1 (assuming stride=1, no padding)
+    """
+    H, W, C_in = input.shape
+    K, _, _ = depthwise_filters.shape
+    _, _, _, C_out = pointwise_filters.shape
+
+    # Calculate output dimensions
+    H_out = H - K + 1
+    W_out = W - K + 1
+
+    # Step 1: Depthwise convolution
+    # Apply one filter per input channel independently
+    depthwise_output = np.zeros((H_out, W_out, C_in))
+
+    for h in range(H_out):
+        for w in range(W_out):
+            for c in range(C_in):
+                # Extract patch for current channel
+                patch = input[h:h+K, w:w+K, c]
+                # Apply corresponding depthwise filter
+                depthwise_output[h, w, c] = np.sum(patch * depthwise_filters[:, :, c])
+
+    # Step 2: Pointwise convolution (1x1 convolution)
+    # Mix channels to produce final output
+    output = np.zeros((H_out, W_out, C_out))
+
+    for h in range(H_out):
+        for w in range(W_out):
+            for k in range(C_out):
+                # Combine all input channels for output channel k
+                output[h, w, k] = np.sum(depthwise_output[h, w, :] * pointwise_filters[0, 0, :, k])
+
+    return output
diff --git a/questions/189_implement-depthwise-separable-convolution/starter_code.py b/questions/189_implement-depthwise-separable-convolution/starter_code.py
@@ -0,0 +1,21 @@
+import numpy as np
+
+def depthwise_separable_conv2d(
+    input: np.ndarray,
+    depthwise_filters: np.ndarray,
+    pointwise_filters: np.ndarray
+) -> np.ndarray:
+    """
+    Implements depthwise separable convolution.
+
+    Args:
+        input: Input tensor of shape (H, W, C_in)
+        depthwise_filters: Depthwise filters of shape (K, K, C_in)
+        pointwise_filters: Pointwise filters of shape (1, 1, C_in, C_out)
+
+    Returns:
+        Output tensor of shape (H_out, W_out, C_out)
+        where H_out = H - K + 1 and W_out = W - K + 1 (assuming stride=1, no padding)
+    """
+    # Your code here
+    pass
Original file line number	Diff line number	Diff line change
		@@ -0,0 +1,3 @@
		Implement a depthwise separable convolution operation, a key building block in efficient neural network architectures like MobileNet, Xception, and EfficientNet. This operation decomposes a standard convolution into two steps: a depthwise convolution that applies a single filter per input channel, followed by a pointwise (1×1) convolution that combines the outputs. This decomposition significantly reduces computational cost and number of parameters while maintaining similar performance.

		Given an input tensor, depthwise filters, and pointwise filters, compute the depthwise separable convolution output. Assume stride=1 and no padding for simplicity.