Sharing results from using CLIP

2025 Samsung Collegiate Programming Challenge: AI Challenge

johyeongeseob

2025.06.30 18:27 1,403 Views

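Hello. I am a graduate student studying artificial intelligence.

I suspect that some participants, myself included, are still unfamiliar with multimodal models.

I am writing this post because I think we will all learn more if we share what we have studied.

I hope it is of some small help to anyone who has not yet gotten a feel for the task.

Below is a simple PyTorch implementation that runs a pretrained CLIP model on the first-round test data (about 50 lines!).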

______________________________________________________________________________________

Environment setup

Windows 11, GPU: NVIDIA GeForce RTX 3080 Ti, CUDA: 12.6

conda create -n clip-env python=3.10

conda activate clip-env

pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121

Required libraries: pandas, torch, clip (OpenAI; pip install git+https://github.com/openai/CLIP.git), PIL (provided by the pillow package)
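Before running the script, it can help to confirm that the libraries listed above are actually importable in the active environment. The following is a minimal sketch using only the standard library; the helper name `find_missing` and the list `REQUIRED_MODULES` are my own illustration, not part of the original code.

```python
import importlib.util

# Import names used by the script; "clip" is the OpenAI package and "PIL" ships with pillow.
REQUIRED_MODULES = ["pandas", "torch", "clip", "PIL"]


def find_missing(modules):
    """Return the top-level module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This only checks that each module can be located. It does not verify versions or whether the installed torch build can see the GPU.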

_______________________________________________________________________________________

import pandas as pd
import torch
import clip
from PIL import Image

# Use the GPU when available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)  # released Jan. 2021, about 150M parameters

# config
CSV_PATH = "./dev_test.csv"
SUBMISSION_PATH = "sample_submission.csv"

df = pd.read_csv(CSV_PATH)
submission = pd.read_csv(SUBMISSION_PATH)

all_data = []  # question + choice texts for every image

for _, row in df.iterrows():
    index = row["ID"]
    image_path = row["img_path"]
    question = row["Question"]
    choices = [row["A"], row["B"], row["C"], row["D"]]

    # One prompt per choice: the question followed by the candidate answer
    texts = [f"{question} {choice}" for choice in choices]

    all_data.append({
        "ID": index,
        "images": image_path,
        "texts": texts
    })

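predictions = []

for item in all_data:
    # Preprocess the image and add a batch dimension
    image = preprocess(Image.open(item["images"]).convert("RGB")).unsqueeze(0).to(device)
    # truncate=True avoids an error when a prompt exceeds CLIP's 77-token context length
    text_tokens = clip.tokenize(item["texts"], truncate=True).to(device)

    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(text_tokens)

        # L2-normalize so the dot product below is a cosine similarity
        image_features /= image_features.norm(dim=-1, keepdim=True)
        text_features /= text_features.norm(dim=-1, keepdim=True)
        similarity = image_features @ text_features.T  # shape: (1, 4)

    # Pick the choice whose prompt is most similar to the image
    pred_index = similarity.argmax().item()
    pred_label = ["A", "B", "C", "D"][pred_index]
    predictions.append(pred_label)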

# Save the results to the submission file
# (assumes sample_submission.csv lists rows in the same order as dev_test.csv)
submission["answer"] = predictions
submission.to_csv("clip_submission.csv", index=False)

print("✅ clip_submission.csv saved!")
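For anyone who wants to see what the scoring step does without loading a model, here is a small sketch of the same idea in plain Python: normalize each vector to unit length, take dot products (which then equal cosine similarities), and pick the highest-scoring choice. The function names and the 3-dimensional vectors are made up for illustration; real CLIP ViT-B/32 embeddings are 512-dimensional tensors.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length, like dividing by features.norm() in the script."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def pick_choice(image_vec, text_vecs, labels=("A", "B", "C", "D")):
    """Return the label whose text vector has the highest cosine similarity to the image."""
    image_unit = l2_normalize(image_vec)
    scores = []
    for text_vec in text_vecs:
        text_unit = l2_normalize(text_vec)
        # The dot product of two unit vectors equals their cosine similarity
        scores.append(sum(a * b for a, b in zip(image_unit, text_unit)))
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores


# Hypothetical 3-dimensional embeddings, made up for illustration only
image_vec = [1.0, 2.0, 0.0]
text_vecs = [
    [0.0, 0.0, 1.0],     # orthogonal to the image
    [2.0, 4.0, 0.0],     # same direction as the image
    [-1.0, -2.0, 0.0],   # opposite direction
    [1.0, 0.0, 0.0],     # partially aligned
]
label, scores = pick_choice(image_vec, text_vecs)
print(label, [round(s, 3) for s in scores])
```

Because of the normalization, only the direction of each vector matters: the second text vector is twice as long as the image vector yet still scores highest, since it points the same way.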

2 Comments

yeongjaeyou

2025.07.03 12:35

What score does this get?

knobbylargely

2025.08.07 15:22

Deleted Comment
