Recall@5 evaluation metric code
import pandas as pd
import numpy as np


def recall5(answer_df, submission_df):
    """
    Calculate recall@5 for given dataframes.

    Parameters:
    - answer_df: DataFrame containing the ground truth
    - submission_df: DataFrame containing the predictions

    Returns:
    - recall: Recall@5 value
    """
    primary_col = answer_df.columns[0]
    secondary_col = answer_df.columns[1]

    # Check if each primary_col entry has exactly 5 secondary_col predictions
    prediction_counts = submission_df.groupby(primary_col).size()
    if not all(prediction_counts == 5):
        raise ValueError(f"Each {primary_col} should have exactly 5 {secondary_col} predictions.")

    # Check for NULL values in the predicted secondary_col
    if submission_df[secondary_col].isnull().any():
        raise ValueError(f"Predicted {secondary_col} contains NULL values.")

    # Check for duplicates in the predicted secondary_col for each primary_col
    duplicated_preds = submission_df.groupby(primary_col)[secondary_col].apply(lambda s: s.duplicated().any())
    if duplicated_preds.any():
        raise ValueError(f"Predicted {secondary_col} contains duplicates for some {primary_col}.")

    # Filter the submission dataframe based on the primary_col present in the answer dataframe
    submission_df = submission_df[submission_df[primary_col].isin(answer_df[primary_col])]

    # For each primary_col, get the top 5 predicted secondary_col values
    top_5_preds = submission_df.groupby(primary_col)[secondary_col].apply(lambda s: s.head(5).tolist()).to_dict()

    # Convert the answer_df to a dictionary for easier lookup
    true_dict = answer_df.groupby(primary_col)[secondary_col].apply(list).to_dict()

    individual_recalls = []
    for key, val in true_dict.items():
        # Keys with no predictions in the submission are skipped, not scored as 0
        if key in top_5_preds:
            correct_matches = len(set(val) & set(top_5_preds[key]))
            # For a fair evaluation, the denominator (k) is set to min(len(val), 5),
            # so a key with more than 5 true items can still reach a recall of 1.
            individual_recall = correct_matches / min(len(val), 5)
            individual_recalls.append(individual_recall)

    # Mean of the per-key recalls
    recall = np.mean(individual_recalls)
    return recall
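
The effect of the min(len(val), 5) denominator can be seen in a small worked example. The following is only a minimal sketch in plain Python: the helper name recall_at_k_single and the sample lists are hypothetical and are not part of the code above. It compares the capped denominator with an uncapped one for a key that has more than five true items.

```python
def recall_at_k_single(true_items, predicted_items, k=5):
    """Recall@k for one key, with the denominator capped at k (hypothetical helper)."""
    hits = len(set(true_items) & set(predicted_items[:k]))
    return hits / min(len(true_items), k)


# Hypothetical sample data: one key with 2 true items, one with 8 true items.
truth_small = ["a", "b"]
truth_large = ["a", "b", "c", "d", "e", "f", "g", "h"]
preds = ["a", "b", "c", "d", "e"]  # five predictions, all of them correct

capped_small = recall_at_k_single(truth_small, preds)   # 2 hits / min(2, 5)
capped_large = recall_at_k_single(truth_large, preds)   # 5 hits / min(8, 5)

# Without the cap, five perfect predictions could never score above 5/8 here.
uncapped_large = len(set(truth_large) & set(preds)) / len(truth_large)

print(capped_small, capped_large, uncapped_large)
```

With the cap, a submission whose five predictions are all correct scores 1.0 for both keys; without it, the key with eight true items would top out at 0.625 no matter how good the predictions were.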
Could you share why Recall@K was chosen as the evaluation metric?