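The 2nd KRX Stock Investment Algorithm Competition
Error when implementing the Sharpe Index in code
Hello! I've run into a problem while writing the code for Sharpe_Index. The Sharpe value should come out to -0.1815012976, but I'm not getting the correct value. If anyone has worked through the Sharpe calculation, I would really appreciate it if you could take a look and help.
Code: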
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

train_data = pd.read_csv("krx_data_june.csv", index_col=False)
train_data = train_data.drop(["Unnamed: 0"], axis=1)
train_data
Out[2]:
               날짜    시가    고가    저가    종가       거래량          거래대금   등락률     종목코드
0      2023-05-31  2995  3445  2935  3020  40162146  128834191830   0.17  A060310
1      2023-06-01  3000  3070  2850  2900   5019544   14708943755  -3.97  A060310
2      2023-06-02  2905  2950  2825  2875   2358195    6798794605  -0.86  A060310
3      2023-06-05  2850  3220  2785  2935  18938294   57564475670   2.09  A060310
4      2023-06-07  2890  3060  2855  2890   4640916   13739139530  -1.53  A060310
...           ...   ...   ...   ...   ...       ...           ...    ...      ...
29995  2023-06-15  7850  7960  7700  7810    162882    1271366710  -0.51  A238490
29996  2023-06-16  7920  8150  7850  8150    208894    1676387870   4.35  A238490
29997  2023-06-19  8150  8150  7870  7980    110649     880563360  -2.09  A238490
29998  2023-06-20  8250  8310  7820  7820    165070    1317821590  -2.01  A238490
29999  2023-06-21  7830  8000  7650  7660    121772     946568870  -2.05  A238490
30000 rows × 9 columns
(Columns: 날짜 = date, 시가 = open, 고가 = high, 저가 = low, 종가 = close, 거래량 = volume, 거래대금 = traded value, 등락률 = change in %, 종목코드 = stock code)
In [3]:
df_submission = pd.read_csv('baseline_submission.csv')
df_submission
Out[3]:
         종목코드    순위
0     A000020  1537
1     A000040  1734
2     A000050  1389
3     A000070   603
4     A000080   540
...       ...   ...
1995  A375500   656
1996  A378850  1630
1997  A383220   659
1998  A383310  1831
1999  A383800   573
2000 rows × 2 columns
(Columns: 종목코드 = stock code, 순위 = rank)
In [4]:
top_200_codes = df_submission[df_submission['순위'].between(1, 200)]['종목코드']
filtered_train_data = train_data[train_data['종목코드'].isin(top_200_codes)]
filtered_train_data
Out[4]:
               날짜     시가     고가     저가     종가     거래량        거래대금   등락률     종목코드
195    2023-05-31  15720  15820  15570  15600  171808  2695180360  -0.76  A079160
196    2023-06-01  15750  16180  15630  16150  261657  4195098390   3.53  A079160
197    2023-06-02  16200  16260  15910  16160  163836  2642075090   0.06  A079160
198    2023-06-05  16170  16290  15510  15520  337932  5315744800  -3.96  A079160
199    2023-06-07  15490  15580  15370  15380  124510  1922637100  -0.90  A079160
...           ...    ...    ...    ...    ...     ...         ...    ...      ...
29800  2023-06-15  21400  22300  20600  21350  326471  6981420150   0.23  A065510
29801  2023-06-16  21200  21250  20550  21000  237749  4957491350  -1.64  A065510
29802  2023-06-19  20900  21350  20000  21150  178453  3697196050   0.71  A065510
29803  2023-06-20  21100  21600  20700  21250  187503  3964491900   0.47  A065510
29804  2023-06-21  21200  21950  20750  21500  167815  3575262050   1.18  A065510
3000 rows × 9 columns
In [5]:
data_0531 = filtered_train_data[filtered_train_data['날짜'] == '2023-05-31']
data_0621 = filtered_train_data[filtered_train_data['날짜'] == '2023-06-21']
plus_sum = (data_0621['종가'].values - data_0531['종가'].values) / data_0531['종가'].values * (-1)
In [6]:
bottom_200_codes = df_submission[df_submission['순위'].between(1801, 2000)]['종목코드']
filtered_train_data = train_data[train_data['종목코드'].isin(bottom_200_codes)]
In [7]:
data_0531 = filtered_train_data[filtered_train_data['날짜'] == '2023-05-31']
data_0621 = filtered_train_data[filtered_train_data['날짜'] == '2023-06-21']
minus_sum = (data_0621['종가'].values - data_0531['종가'].values) / data_0531['종가'].values
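One thing worth checking in the two period-return cells above: subtracting `.values` arrays pairs rows purely by position, which is only correct if both date slices contain the same stocks in the same order. Below is a minimal sketch of a label-aligned alternative on a tiny made-up frame. The prices, the stock codes, and the helper name `period_return` are illustrative assumptions, not part of the competition data.

```python
import pandas as pd

# Toy data in the same long format as krx_data_june.csv (all values are made up).
toy = pd.DataFrame({
    "날짜": ["2023-05-31", "2023-05-31", "2023-06-21", "2023-06-21"],
    "종목코드": ["A000001", "A000002", "A000002", "A000001"],  # row order differs per date
    "종가": [100, 200, 220, 90],
})

def period_return(df, start, end):
    """Per-stock (end close - start close) / start close, aligned by 종목코드."""
    # One row per stock code, one column per date: alignment is by label, not position.
    close = df.pivot(index="종목코드", columns="날짜", values="종가")
    return (close[end] - close[start]) / close[start]

returns = period_return(toy, "2023-05-31", "2023-06-21")
print(returns)
```

With positional subtraction the shuffled row order in this toy frame would silently pair the wrong stocks, while the pivoted version does not depend on row order.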
In [8]:
total_profit = (sum(plus_sum) + sum(minus_sum)) / 400
total_profit
Out[8]: -0.003912638080208146
In [9]:
year_total_profit = total_profit * 250 / 15
year_total_profit
Out[9]: -0.06521063467013577
In [ ]:
top_200_codes = df_submission[df_submission['순위'].between(1, 200)]['종목코드']
filtered_train_data = train_data[train_data['종목코드'].isin(top_200_codes)]
filtered_train_data['날짜'] = pd.to_datetime(filtered_train_data['날짜'], format='%Y-%m-%d')
filtered_train_data = filtered_train_data.sort_values(by='날짜')
grouped_data = filtered_train_data.groupby('종목코드')
plus_temp_mean = pd.DataFrame(columns=['종목코드', '일간수익률'])
for code, group in grouped_data:
    group['일간수익률'] = (group['종가'] - group['종가'].shift(1)) / group['종가'].shift(1)
    daily_mean = group['일간수익률'].mean()
    plus_temp_mean = plus_temp_mean.append({'종목코드': code, '일간수익률': daily_mean}, ignore_index=True)
In [ ]:
bottom_200_codes = df_submission[df_submission['순위'].between(1801, 2000)]['종목코드']
filtered_train_data = train_data[train_data['종목코드'].isin(bottom_200_codes)]
filtered_train_data['날짜'] = pd.to_datetime(filtered_train_data['날짜'], format='%Y-%m-%d')
filtered_train_data = filtered_train_data.sort_values(by='날짜')
grouped_data = filtered_train_data.groupby('종목코드')
minus_temp_mean = pd.DataFrame(columns=['종목코드', '일간수익률'])
for code, group in grouped_data:
    group['일간수익률'] = (group['종가'] - group['종가'].shift(1)) / group['종가'].shift(1) * (-1)
    daily_mean = group['일간수익률'].mean()
    minus_temp_mean = minus_temp_mean.append({'종목코드': code, '일간수익률': daily_mean}, ignore_index=True)
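A side note on the two loops above: `DataFrame.append` was removed in pandas 2.0, so they only run on older pandas versions. The sketch below shows one way to get the same per-stock mean daily return with `groupby(...).pct_change()`, demonstrated on made-up prices. The helper name `mean_daily_return` and its `sign` parameter are hypothetical, introduced only for this illustration.

```python
import pandas as pd

# Toy prices for two stocks over three days (all values are made up).
toy = pd.DataFrame({
    "날짜": pd.to_datetime(["2023-05-31", "2023-06-01", "2023-06-02"] * 2),
    "종목코드": ["A000001"] * 3 + ["A000002"] * 3,
    "종가": [100.0, 110.0, 99.0, 200.0, 190.0, 209.0],
})

def mean_daily_return(df, sign=1.0):
    """Mean close-to-close daily return per stock; sign=-1.0 flips it for a short leg."""
    df = df.sort_values(["종목코드", "날짜"])
    # pct_change within each stock gives (close - previous close) / previous close.
    daily = df.groupby("종목코드")["종가"].pct_change() * sign
    # The first day of each stock is NaN and is skipped by mean().
    out = daily.groupby(df["종목코드"]).mean()
    return out.rename("일간수익률").reset_index()

long_leg = mean_daily_return(toy)              # returns kept as-is
short_leg = mean_daily_return(toy, sign=-1.0)  # returns negated
result = pd.concat([long_leg, short_leg], ignore_index=True)
print(result)
```

Building each leg in one vectorized step also avoids assigning a new column to a group slice inside the loop.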
In [12]:
result = pd.concat([plus_temp_mean, minus_temp_mean], ignore_index=True)
result
Out[12]:
        종목코드     일간수익률
0    A000370  -0.005196
1    A000490   0.002337
2    A000760  -0.013980
3    A000880   0.000897
4    A001120   0.010673
..       ...        ...
395  A347700  -0.001082
396  A351330  -0.002359
397  A355150   0.004803
398  A373200  -0.030547
399  A383310  -0.002415
400 rows × 2 columns
(Columns: 종목코드 = stock code, 일간수익률 = daily return)
In [15]:
sigma_sum = np.sqrt(((result['일간수익률'] - result['일간수익률'].mean()) ** 2).sum() / 13)
sigma_sum
Out[15]: 0.05483072754357199
In [16]:
print((total_profit - 0.035) / sigma_sum)
-0.7096866998397147
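Two details in cells 15 and 16 may be worth a second look. `sigma_sum` measures the spread across the 400 per-stock mean returns (while dividing by 13) rather than the day-to-day spread of the portfolio's return, and the numerator uses the unannualized `total_profit` instead of `year_total_profit`. For comparison, here is a minimal sketch of the textbook annualized Sharpe ratio computed from a series of daily portfolio returns. This is not necessarily the competition's official scoring formula, which the post does not quote. The 250 trading days and the 0.035 risk-free rate are taken from the constants used above, and the daily returns are made up.

```python
import math
import statistics

def sharpe_ratio(daily_returns, risk_free=0.035, trading_days=250):
    """Textbook annualized Sharpe ratio from a sequence of daily portfolio returns."""
    # Annualize the mean daily return linearly and the volatility by sqrt(time).
    ann_return = statistics.mean(daily_returns) * trading_days
    ann_vol = statistics.stdev(daily_returns) * math.sqrt(trading_days)  # sample std (n - 1)
    return (ann_return - risk_free) / ann_vol

daily = [0.01, -0.02, 0.015, 0.0, -0.005]  # made-up daily long-short portfolio returns
print(sharpe_ratio(daily))
```

The key design point is that both the numerator and the denominator are derived from the same daily portfolio series and brought to the same annual scale before dividing.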
plus_sum = (data_0621['종가'].values - data_0531['종가'].values) / data_0531['종가'].values * (-1)
In this part the top 200 is the long leg, so why is it multiplied by -1?
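To illustrate the point raised in this reply: in a long-short portfolio the long leg (top 200) keeps its return as-is, and only the short leg (bottom 200) is negated. A minimal sketch with made-up returns follows. `long_short_return` is a hypothetical helper, and equal weighting across both legs is an assumption that mirrors the division by 400 in the original cell.

```python
def long_short_return(long_returns, short_returns):
    """Equal-weighted long-short return: long leg as-is, short leg negated."""
    # A short position profits when the price falls, so its return flips sign.
    total = sum(long_returns) + sum(-r for r in short_returns)
    return total / (len(long_returns) + len(short_returns))

top = [0.10, 0.02]      # made-up period returns of the long (top-ranked) stocks
bottom = [-0.04, 0.00]  # made-up period returns of the short (bottom-ranked) stocks
print(long_short_return(top, bottom))
```

In the posted code the signs are the other way around: `plus_sum` (top 200) is multiplied by -1 and `minus_sum` (bottom 200) is not.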