2nd KRX Stock Investment Algorithm Competition

Error implementing the Sharpe Index code

2023.07.25 21:17 1,825 Views

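Hello! I've run into a problem while writing the code for Sharpe_Index. The Sharpe formula should give -0.1815012976, but my code does not produce the right value. If anyone has already worked through the Sharpe calculation, I would really appreciate it if you could take a look and help.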


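Code (column names are left in Korean because the code refers to them: 날짜 = date, 시가 = open, 고가 = high, 저가 = low, 종가 = close, 거래량 = volume, 거래대금 = trading value, 등락률 = percent change, 종목코드 = stock code, 순위 = rank, 일간수익률 = daily return):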

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

train_data = pd.read_csv("krx_data_june.csv", index_col=False)
train_data = train_data.drop(["Unnamed: 0"], axis=1)
train_data

Out[2]:
       날짜          시가    고가    저가    종가    거래량      거래대금        등락률   종목코드
0      2023-05-31  2995  3445  2935  3020  40162146  128834191830   0.17  A060310
1      2023-06-01  3000  3070  2850  2900   5019544   14708943755  -3.97  A060310
2      2023-06-02  2905  2950  2825  2875   2358195    6798794605  -0.86  A060310
3      2023-06-05  2850  3220  2785  2935  18938294   57564475670   2.09  A060310
4      2023-06-07  2890  3060  2855  2890   4640916   13739139530  -1.53  A060310
...    ...         ...   ...   ...   ...   ...       ...            ...   ...
29995  2023-06-15  7850  7960  7700  7810    162882    1271366710  -0.51  A238490
29996  2023-06-16  7920  8150  7850  8150    208894    1676387870   4.35  A238490
29997  2023-06-19  8150  8150  7870  7980    110649     880563360  -2.09  A238490
29998  2023-06-20  8250  8310  7820  7820    165070    1317821590  -2.01  A238490
29999  2023-06-21  7830  8000  7650  7660    121772     946568870  -2.05  A238490

30000 rows × 9 columns

In [3]:

df_submission = pd.read_csv('baseline_submission.csv')
df_submission

Out[3]:
      종목코드   순위
0     A000020  1537
1     A000040  1734
2     A000050  1389
3     A000070   603
4     A000080   540
...   ...      ...
1995  A375500   656
1996  A378850  1630
1997  A383220   659
1998  A383310  1831
1999  A383800   573

2000 rows × 2 columns

In [4]:

top_200_codes = df_submission[df_submission['순위'].between(1, 200)]['종목코드']
filtered_train_data = train_data[train_data['종목코드'].isin(top_200_codes)]
filtered_train_data

Out[4]:
       날짜          시가     고가     저가     종가     거래량    거래대금      등락률   종목코드
195    2023-05-31  15720  15820  15570  15600  171808  2695180360  -0.76  A079160
196    2023-06-01  15750  16180  15630  16150  261657  4195098390   3.53  A079160
197    2023-06-02  16200  16260  15910  16160  163836  2642075090   0.06  A079160
198    2023-06-05  16170  16290  15510  15520  337932  5315744800  -3.96  A079160
199    2023-06-07  15490  15580  15370  15380  124510  1922637100  -0.90  A079160
...    ...         ...    ...    ...    ...    ...     ...          ...   ...
29800  2023-06-15  21400  22300  20600  21350  326471  6981420150   0.23  A065510
29801  2023-06-16  21200  21250  20550  21000  237749  4957491350  -1.64  A065510
29802  2023-06-19  20900  21350  20000  21150  178453  3697196050   0.71  A065510
29803  2023-06-20  21100  21600  20700  21250  187503  3964491900   0.47  A065510
29804  2023-06-21  21200  21950  20750  21500  167815  3575262050   1.18  A065510

3000 rows × 9 columns

In [5]:

data_0531 = filtered_train_data[filtered_train_data['날짜'] == '2023-05-31']
data_0621 = filtered_train_data[filtered_train_data['날짜'] == '2023-06-21']
plus_sum = (data_0621['종가'].values - data_0531['종가'].values) / data_0531['종가'].values * (-1)

In [6]:

bottom_200_codes = df_submission[df_submission['순위'].between(1801, 2000)]['종목코드']
filtered_train_data = train_data[train_data['종목코드'].isin(bottom_200_codes)]

In [7]:

data_0531 = filtered_train_data[filtered_train_data['날짜'] == '2023-05-31']
data_0621 = filtered_train_data[filtered_train_data['날짜'] == '2023-06-21']
minus_sum = (data_0621['종가'].values - data_0531['종가'].values) / data_0531['종가'].values

In [8]:

total_profit = (sum(plus_sum) + sum(minus_sum)) / 400
total_profit
Out[8]:-0.003912638080208146

In [9]:

year_total_profit = total_profit * 250 / 15
year_total_profit
Out[9]:-0.06521063467013577

In [ ]:

top_200_codes = df_submission[df_submission['순위'].between(1, 200)]['종목코드']

filtered_train_data = train_data[train_data['종목코드'].isin(top_200_codes)]

filtered_train_data['날짜'] = pd.to_datetime(filtered_train_data['날짜'], format='%Y-%m-%d')

filtered_train_data = filtered_train_data.sort_values(by='날짜')

grouped_data = filtered_train_data.groupby('종목코드')

plus_temp_mean = pd.DataFrame(columns=['종목코드', '일간수익률'])

for code, group in grouped_data:
    group['일간수익률'] = (group['종가'] - group['종가'].shift(1)) / group['종가'].shift(1)
    daily_mean = group['일간수익률'].mean()
    plus_temp_mean = plus_temp_mean.append({'종목코드': code, '일간수익률': daily_mean}, ignore_index=True)

In [ ]:

bottom_200_codes = df_submission[df_submission['순위'].between(1801, 2000)]['종목코드']

filtered_train_data = train_data[train_data['종목코드'].isin(bottom_200_codes)]

filtered_train_data['날짜'] = pd.to_datetime(filtered_train_data['날짜'], format='%Y-%m-%d')

filtered_train_data = filtered_train_data.sort_values(by='날짜')

grouped_data = filtered_train_data.groupby('종목코드')

minus_temp_mean = pd.DataFrame(columns=['종목코드', '일간수익률'])

for code, group in grouped_data:
    group['일간수익률'] = (group['종가'] - group['종가'].shift(1)) / group['종가'].shift(1) * (-1)
    daily_mean = group['일간수익률'].mean()
    minus_temp_mean = minus_temp_mean.append({'종목코드': code, '일간수익률': daily_mean}, ignore_index=True)
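
A side note on the two loops above: `DataFrame.append` was removed in pandas 2.0, so they fail on a current install. The following is a minimal sketch of the same per-stock mean daily return without `append`, assuming the column names from the post; the function name `mean_daily_return_by_code` and its `sign` parameter are hypothetical, introduced only for illustration.

```python
import pandas as pd

def mean_daily_return_by_code(prices: pd.DataFrame, sign: float = 1.0) -> pd.DataFrame:
    """Mean day-over-day close return per stock code, multiplied by `sign`.

    Hypothetical helper: pass sign=-1.0 to mirror the short-leg cell above.
    """
    # Sort so that shift(1) within each code really is the previous trading day.
    ordered = prices.sort_values(['종목코드', '날짜'])
    prev_close = ordered.groupby('종목코드')['종가'].shift(1)
    # Same formula as in the post: (close - previous close) / previous close.
    daily = (ordered['종가'] - prev_close) / prev_close * sign
    # mean() skips the NaN on each code's first day, as Series.mean() does in the loop.
    out = daily.groupby(ordered['종목코드']).mean().reset_index()
    out.columns = ['종목코드', '일간수익률']
    return out
```

With this helper, `plus_temp_mean` would correspond to a call on the top-200 rows with the default sign, and `minus_temp_mean` to a call on the bottom-200 rows with `sign=-1.0`.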

In [12]:

result = pd.concat([plus_temp_mean, minus_temp_mean], ignore_index=True)
result

Out[12]:
     종목코드   일간수익률
0    A000370  -0.005196
1    A000490   0.002337
2    A000760  -0.013980
3    A000880   0.000897
4    A001120   0.010673
...  ...       ...
395  A347700  -0.001082
396  A351330  -0.002359
397  A355150   0.004803
398  A373200  -0.030547
399  A383310  -0.002415

400 rows × 2 columns

In [15]:

sigma_sum = np.sqrt(((result['일간수익률'] - result['일간수익률'].mean()) ** 2).sum() / 13)
sigma_sum
Out[15]:0.05483072754357199

In [16]:

print((total_profit - 0.035) / sigma_sum)
-0.7096866998397147


검은짱돌의계략
2023.07.26 06:02

plus_sum = (data_0621['종가'].values - data_0531['종가'].values) / data_0531['종가'].values * (-1)
This part is a Long position on the top 200, so why multiply by -1?
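
To illustrate the point about signs, here is a minimal sketch of an equal-weighted long-short period return in which the long leg keeps its sign and only the short leg is negated. It also aligns the two dates by stock code through a pivot, because subtracting `.values` arrays works by position and silently assumes both date slices list the stocks in the same order. The function name `long_short_return` is hypothetical, and whether the competition scores returns exactly this way is an assumption.

```python
import pandas as pd

def long_short_return(prices, long_codes, short_codes, start, end):
    """Equal-weighted long-short return between two dates (illustrative sketch)."""
    # One row per stock code, one column per date, so returns align by code.
    closes = prices.pivot(index='종목코드', columns='날짜', values='종가')
    period_return = (closes[end] - closes[start]) / closes[start]
    long_leg = period_return.reindex(list(long_codes))      # long: keep the sign
    short_leg = -period_return.reindex(list(short_codes))   # short: flip the sign
    # Codes missing from the price data become NaN and are skipped by mean().
    return pd.concat([long_leg, short_leg]).mean()
```

Called with `top_200_codes`, `bottom_200_codes`, `'2023-05-31'` and `'2023-06-21'`, this would play the role of `total_profit` in the original post, with the signs swapped relative to the posted cells.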