This repository uses Meta's SAM 2 to cut subjects out of videos and produce transparent-background (GB) footage.
It removes the limits of the official SAM 2 demo site, namely the 70 MB upload cap and the fixed 24 fps output, so clips can be cut out with far fewer restrictions.
- Windows 11
- WSL2
- GeForce RTX 3060
- uv 0.6.14
- CUDA 11.8
- NVIDIA Driver 560.94
- Install the NVIDIA driver
- Install CUDA (version 11.8 recommended; 12.1 or 12.4 should also work)
- Install WSL2
- Install uv inside WSL
- After completing the four steps above, run the installation commands shown below
- Put your video into the videos folder under data_for_gui (filenames containing only ASCII letters and digits are recommended)
Run python mask_app.py --root_dir data_for_gui --checkpoint_dir checkpoints/sam2.1_hiera_tiny.pt --model_cfg configs/sam2.1/sam2.1_hiera_t.yaml
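The GUI launched by mask_app.py handles prompting and export interactively, but underneath it the flow is SAM 2's video-predictor API. The sketch below shows that flow directly; it assumes the video has already been split into JPEG frames named 00000.jpg, 00001.jpg, ... in a hypothetical data_for_gui/videos/sample directory, and the click coordinates and output filenames are placeholders. It is not the exact code inside mask_app.py.

import numpy as np
import torch
from PIL import Image
from sam2.build_sam import build_sam2_video_predictor

# Same checkpoint/config pair as the mask_app.py command above.
checkpoint = "checkpoints/sam2.1_hiera_tiny.pt"
model_cfg = "configs/sam2.1/sam2.1_hiera_t.yaml"
predictor = build_sam2_video_predictor(model_cfg, checkpoint)

video_dir = "data_for_gui/videos/sample"  # hypothetical directory of JPEG frames

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    state = predictor.init_state(video_path=video_dir)

    # One positive click (x, y) on frame 0 selects the subject (obj_id 1).
    predictor.add_new_points_or_box(
        inference_state=state,
        frame_idx=0,
        obj_id=1,
        points=np.array([[300, 200]], dtype=np.float32),
        labels=np.array([1], dtype=np.int32),
    )

    # Propagate the prompt through the clip and turn each mask into an RGBA frame,
    # which is how a segmentation mask becomes transparent-background material.
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        mask = (mask_logits[0, 0] > 0.0).cpu().numpy()
        frame = np.array(Image.open(f"{video_dir}/{frame_idx:05d}.jpg").convert("RGB"))
        rgba = np.dstack([frame, (mask * 255).astype(np.uint8)])
        Image.fromarray(rgba).save(f"out_{frame_idx:05d}.png")

The GUI presumably wires these same steps to mouse clicks and reassembles the frames into the final output; the sketch is only meant to show the path from prompts to masks to transparent frames.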
wsl
mkdir sam2_test
cd sam2_test
uv init
git clone git@github.com:clean262/sam2.git
# Edit pyproject.toml at this point (see the notes and file below)
uv sync  # Python and PyTorch are downloaded here
. .venv/bin/activate  # activate the virtual environment created by uv sync
cd sam2
uv pip install -r requirements.txt
cd checkpoints
sed -i 's/\r$//' download_ckpts.sh
./download_ckpts.sh && cd ..

The file below is the pyproject.toml used for this setup.
If your CUDA version is not 11.8, you need to change the parts below that read index = "pytorch-cu118", name = "pytorch-cu118", and url = "https://download.pytorch.org/whl/cu118".
If your CUDA version is 12.1, change them to index = "pytorch-cu121", name = "pytorch-cu121", and url = "https://download.pytorch.org/whl/cu121".
If your CUDA version is 12.4, change them to index = "pytorch-cu124", name = "pytorch-cu124", and url = "https://download.pytorch.org/whl/cu124".
For more details on installing PyTorch, see here.
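Whichever index you end up using, a quick sanity check after uv sync confirms that the installed PyTorch wheel matches your CUDA toolkit. This is a minimal, optional snippet, not part of the repository:

# Run inside the environment created by uv sync.
import torch

print(torch.__version__)          # e.g. 2.5.1+cu118
print(torch.version.cuda)         # CUDA version the wheel was built against, e.g. 11.8
print(torch.cuda.is_available())  # should be True on a working WSL2 + driver setup
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. NVIDIA GeForce RTX 3060

The full pyproject.toml used in this setup follows.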
[project]
name = "sam2-test"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
"torch==2.5.1",
"torchvision==0.20.1",
]
[tool.uv.sources]
torch = [
{ index = "pytorch-cu118", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
]
torchvision = [
{ index = "pytorch-cu118", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
]
[[tool.uv.index]]
name = "pytorch-cu118"
url = "https://download.pytorch.org/whl/cu118"
explicit = true

The SAM 2 model checkpoints, SAM 2 demo code (front-end and back-end), and SAM 2 training code are licensed under Apache 2.0; however, the Inter Font and Noto Color Emoji used in the SAM 2 demo code are made available under the SIL Open Font License, version 1.1.
See contributing and the code of conduct.
The SAM 2 project was made possible with the help of many contributors (alphabetical):
Karen Bergan, Daniel Bolya, Alex Bosenberg, Kai Brown, Vispi Cassod, Christopher Chedeau, Ida Cheng, Luc Dahlin, Shoubhik Debnath, Rene Martinez Doehner, Grant Gardner, Sahir Gomez, Rishi Godugu, Baishan Guo, Caleb Ho, Andrew Huang, Somya Jain, Bob Kamma, Amanda Kallet, Jake Kinney, Alexander Kirillov, Shiva Koduvayur, Devansh Kukreja, Robert Kuo, Aohan Lin, Parth Malani, Jitendra Malik, Mallika Malhotra, Miguel Martin, Alexander Miller, Sasha Mitts, William Ngan, George Orlin, Joelle Pineau, Kate Saenko, Rodrick Shepard, Azita Shokrpour, David Soofian, Jonathan Torres, Jenny Truong, Sagar Vaze, Meng Wang, Claudette Ward, Pengchuan Zhang.
Third-party code: we use a GPU-based connected component algorithm adapted from cc_torch (with its license in LICENSE_cctorch) as an optional post-processing step for the mask predictions.
If you use SAM 2 or the SA-V dataset in your research, please use the following BibTeX entry.
@article{ravi2024sam2,
title={SAM 2: Segment Anything in Images and Videos},
author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
journal={arXiv preprint arXiv:2408.00714},
url={https://arxiv.org/abs/2408.00714},
year={2024}
}

