Travel Product Subscription Prediction AI Hackathon

[Private: 18th place] 0.92215, sharing my code!

2022.09.06 21:03 · 1,140 views

This is my first time sharing code, so it may be rough in places, but I hope it helps. Thanks to everyone who took part in the competition for all your hard work!

일등
2022.09.07 00:06

Good preprocessing really does come down to intuition! Thanks :)

kisooofficial
2022.09.07 00:59

# Fill MonthlyIncome with the mean of each Designation group.
# Assignment form is used because chained fillna(..., inplace=True) is unreliable in recent pandas.
train_set['MonthlyIncome'] = train_set['MonthlyIncome'].fillna(train_set.groupby('Designation')['MonthlyIncome'].transform('mean'))
test_set['MonthlyIncome'] = test_set['MonthlyIncome'].fillna(test_set.groupby('Designation')['MonthlyIncome'].transform('mean'))
print(train_set.shape)  # (1955, 19)
print(train_set[train_set['MonthlyIncome'].notnull()].groupby(['Designation'])['MonthlyIncome'].mean())

# Fill NumberOfChildrenVisiting by MaritalStatus group, then NumberOfFollowups by NumberOfChildrenVisiting group.
train_set['NumberOfChildrenVisiting'] = train_set['NumberOfChildrenVisiting'].fillna(train_set.groupby('MaritalStatus')['NumberOfChildrenVisiting'].transform('mean'))
test_set['NumberOfChildrenVisiting'] = test_set['NumberOfChildrenVisiting'].fillna(test_set.groupby('MaritalStatus')['NumberOfChildrenVisiting'].transform('mean'))
train_set['NumberOfFollowups'] = train_set['NumberOfFollowups'].fillna(train_set.groupby('NumberOfChildrenVisiting')['NumberOfFollowups'].transform('mean'))
test_set['NumberOfFollowups'] = test_set['NumberOfFollowups'].fillna(test_set.groupby('NumberOfChildrenVisiting')['NumberOfFollowups'].transform('mean'))

Filling the missing values in test_set with means computed from test_set itself is data leakage.
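One possible way to avoid this is to learn the group means from train_set only and then apply those same values to both sets. The snippet below is a rough sketch under that assumption, not the original author's method: the helper names `fit_group_means` and `apply_group_means` are hypothetical, the DataFrames are made-up toy data rather than the competition data, and the global-mean fallback for groups that never appear in train_set is my own choice.

```python
import numpy as np
import pandas as pd


def fit_group_means(train, group_col, value_col):
    """Learn per-group means and a global fallback from the training data only."""
    group_means = train.groupby(group_col)[value_col].mean()
    global_mean = train[value_col].mean()
    return group_means, global_mean


def apply_group_means(df, group_col, value_col, group_means, global_mean):
    """Fill missing values with previously learned statistics; returns a copy."""
    # Look up each row's group mean; groups unseen in training fall back to the global mean.
    fill_values = df[group_col].map(group_means).fillna(global_mean)
    out = df.copy()
    out[value_col] = out[value_col].fillna(fill_values)
    return out


# Toy stand-ins for train_set / test_set (made-up values, not the competition data).
train_set = pd.DataFrame({
    "Designation": ["Manager", "Manager", "Executive", "Executive"],
    "MonthlyIncome": [30000.0, np.nan, 20000.0, 22000.0],
})
test_set = pd.DataFrame({
    "Designation": ["Manager", "Executive", "VP"],
    "MonthlyIncome": [np.nan, np.nan, np.nan],
})

# Statistics come from train_set only, then get applied to both sets.
means, fallback = fit_group_means(train_set, "Designation", "MonthlyIncome")
train_filled = apply_group_means(train_set, "Designation", "MonthlyIncome", means, fallback)
test_filled = apply_group_means(test_set, "Designation", "MonthlyIncome", means, fallback)
print(test_filled)
```

The same pair of helpers could be reused for NumberOfChildrenVisiting and NumberOfFollowups by changing the column arguments. When cross-validating, the fit step would need to run inside each fold on that fold's training portion.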

shki
2022.09.07 10:29

As kisooofficial said, this code does amount to data leakage.

파이썬짱
2022.09.07 11:37

I'm going to sue you.

shki
2022.09.07 12:25

Sure, I'll be waiting.