Recall@5 evaluation metric code
import pandas as pd
import numpy as np


def recall5(answer_df, submission_df):
    """
    Calculate recall@5 for given dataframes.

    Parameters:
    - answer_df: DataFrame containing the ground truth
    - submission_df: DataFrame containing the predictions

    Returns:
    - recall: Recall@5 value
    """
    primary_col = answer_df.columns[0]
    secondary_col = answer_df.columns[1]

    # Check if each primary_col entry has exactly 5 secondary_col predictions
    prediction_counts = submission_df.groupby(primary_col).size()
    if not all(prediction_counts == 5):
        raise ValueError(f"Each {primary_col} should have exactly 5 {secondary_col} predictions.")

    # Check for NULL values in the predicted secondary_col
    if submission_df[secondary_col].isnull().any():
        raise ValueError(f"Predicted {secondary_col} contains NULL values.")

    # Check for duplicates in the predicted secondary_col for each primary_col
    duplicated_preds = submission_df.groupby(primary_col).apply(lambda x: x[secondary_col].duplicated().any())
    if duplicated_preds.any():
        raise ValueError(f"Predicted {secondary_col} contains duplicates for some {primary_col}.")

    # Filter the submission dataframe based on the primary_col present in the answer dataframe
    submission_df = submission_df[submission_df[primary_col].isin(answer_df[primary_col])]

    # For each primary_col, get the top 5 predicted secondary_col values
    top_5_preds = submission_df.groupby(primary_col).apply(lambda x: x[secondary_col].head(5).tolist()).to_dict()

    # Convert the answer_df to a dictionary for easier lookup
    true_dict = answer_df.groupby(primary_col).apply(lambda x: x[secondary_col].tolist()).to_dict()

    individual_recalls = []
    for key, val in true_dict.items():
        if key in top_5_preds:
            correct_matches = len(set(true_dict[key]) & set(top_5_preds[key]))
            # To keep the evaluation fair, the denominator (k) is set to min(len(val), 5)
            individual_recall = correct_matches / min(len(val), 5)
            individual_recalls.append(individual_recall)

    recall = np.mean(individual_recalls)
    return recall
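To make the effect of the `min(len(val), 5)` denominator concrete, here is a minimal standalone sketch of the per-key calculation. The helper name `recall_at_k` and the item labels are hypothetical and used only for illustration; they are not part of the official scoring code.

```python
def recall_at_k(true_items, predicted_items, k=5):
    """Per-key recall with the denominator capped at k (hypothetical helper)."""
    top_k = predicted_items[:k]
    hits = len(set(true_items) & set(top_k))
    # Same cap as in recall5: a key with more than k true items can still reach 1.0
    return hits / min(len(true_items), k)


# Illustrative data: one key with 2 true items, one key with 8 true items
few = recall_at_k(["a", "b"], ["a", "b", "x", "y", "z"])
many = recall_at_k(list("abcdefgh"), list("abcde"))

# Without the cap, the second key could score at most 5 / 8 even with 5 perfect predictions
uncapped_many = len(set("abcdefgh") & set("abcde")) / len("abcdefgh")

print(few, many, uncapped_many)
```

With the cap, a key that has more than five true items can still score 1.0 when all five predictions are correct; with a plain `len(val)` denominator it would top out at 5 / 8 in this example.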
Could you share why Recall@K was chosen as the evaluation metric?