[Proposed] Imbalanced Data (def) - Failed

Author

김보람

Published

February 5, 2024

imports

import pandas as pd
import numpy as np
import sklearn
import pickle 
import time 
import datetime
import warnings
warnings.filterwarnings('ignore')
%run function_proposed_gcn.py
with open('fraudTrain.pkl', 'rb') as file:
    fraudTrain = pickle.load(file)    
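# Positional arguments appear to be: train fraud rate, test fraud rate, theta, gamma (inferred from the results table).
df_results = try_1(fraudTrain, 0.3, 0.05, 8.028000e+04, 0.3)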
df_results = try_1(fraudTrain, 0.3, 0.05, 8.028000e+04, 0.2, prev_results=df_results)
df_results = try_1(fraudTrain, 0.3, 0.005, 8.028000e+04, 0.2, prev_results=df_results)
df_results = try_1(fraudTrain, 0.2, 0.05, 8.028000e+04, 0.2, prev_results=df_results)
df_results = try_1(fraudTrain, 0.2, 0.005, 8.028000e+04, 0.2, prev_results=df_results)
df_results = try_1(fraudTrain, 0.3, 0.005, 8.028000e+04, 0.3, prev_results=df_results)
df_results = try_1(fraudTrain, 0.5, 0.5, 8.028000e+04, 0.3, prev_results=df_results)
df_results
   model  time       acc       pre       rec        f1       auc  graph_based    method  throw_rate  train_size                                         train_cols  train_frate  test_size  test_frate  hyper_params    theta  gamma
0    GCN  None  0.972028  0.652778  0.940000  0.770492  0.956856         True  Proposed         0.3       20020  [level_0, trans_date_trans_time, cc_num, merch...          0.3      20020       0.050          None  80280.0    0.3
1    GCN  None  0.972194  0.650794  0.956667  0.774629  0.964839         True  Proposed         0.3       20020  [level_0, trans_date_trans_time, cc_num, merch...          0.3      20020       0.050          None  80280.0    0.2
2    GCN  None  0.972694  0.143617  0.900000  0.247706  0.936529         True  Proposed         0.3       20020  [level_0, trans_date_trans_time, cc_num, merch...          0.3      20020       0.005          None  80280.0    0.2
3    GCN  None  0.977356  0.711340  0.920000  0.802326  0.950186         True  Proposed         0.2       30030  [level_0, trans_date_trans_time, cc_num, merch...          0.2      30030       0.050          None  80280.0    0.2
4    GCN  None  0.978910  0.171946  0.844444  0.285714  0.912015         True  Proposed         0.2       30030  [level_0, trans_date_trans_time, cc_num, merch...          0.2      30030       0.005          None  80280.0    0.2
5    GCN  None  0.971029  0.136364  0.900000  0.236842  0.935693         True  Proposed         0.3       20020  [level_0, trans_date_trans_time, cc_num, merch...          0.3      20020       0.005          None  80280.0    0.3
6    GCN  None  0.964752  0.969697  0.959467  0.964555  0.964750         True  Proposed         0.5       12012  [level_0, trans_date_trans_time, cc_num, merch...          0.5      12012       0.500          None  80280.0    0.3

train_cols needs fixing so that only amt is shown.
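
One way to get there without touching try_1 is to post-process the results table. The sketch below assumes that each train_cols entry is a list of column names and that amt is the only column that should be reported; keep_feature_cols is a hypothetical helper, not something defined in function_proposed_gcn.py, and the toy table uses made-up values.

```python
import pandas as pd

def keep_feature_cols(results, features=("amt",)):
    """Return a copy of `results` whose train_cols lists keep only `features`."""
    out = results.copy()
    out["train_cols"] = out["train_cols"].apply(
        lambda cols: [c for c in cols if c in features]
    )
    return out

# Toy stand-in for df_results (made-up values, for illustration only).
toy = pd.DataFrame({
    "model": ["GCN"],
    "train_cols": [["level_0", "trans_date_trans_time", "cc_num", "amt"]],
})
cleaned = keep_feature_cols(toy)
print(cleaned["train_cols"].iloc[0])
```

If the real fix belongs inside try_1 (recording only the feature columns when the row is built), the same list filter could be applied there instead.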

time: is timing even meaningful for the proposed method?
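
If timing does turn out to be worth reporting, the time column (None in every row above) could be filled by wrapping the training step in a timer. This is a minimal sketch with a stand-in workload; timed and fake_fit are hypothetical names, and where exactly the timer would go inside try_1 is an assumption.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in workload; in the notebook this would be the GCN fit/predict step.
def fake_fit(n):
    return sum(i * i for i in range(n))

value, seconds = timed(fake_fit, 100_000)
print(f"elapsed: {seconds:.4f}s")
```

time.perf_counter is used rather than time.time because it is monotonic and intended for measuring short durations.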

For now, the results above need to be saved to a file that Excel can open.

# Timestamp the file name so earlier result files are not overwritten.
ymdhms = datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
df_results.to_csv(f'./results/{ymdhms}-proposed.csv', index=False)
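
Note that to_csv does not create missing directories, so the call above fails if ./results does not exist yet. A hedged sketch of a small wrapper that creates the directory first is shown below; save_results and its parameters are hypothetical, and the toy table holds made-up values written to a temporary directory.

```python
import tempfile
from datetime import datetime
from pathlib import Path

import pandas as pd

def save_results(results, out_dir="./results", tag="proposed"):
    """Write `results` to <out_dir>/<timestamp>-<tag>.csv, creating out_dir if needed."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)  # to_csv will not create it
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = out_path / f"{stamp}-{tag}.csv"
    results.to_csv(target, index=False)
    return target

# Toy table with made-up values, written to a temporary directory.
toy = pd.DataFrame({"model": ["GCN"], "gamma": [0.3]})
tmp_dir = tempfile.mkdtemp()
saved = save_results(toy, out_dir=Path(tmp_dir) / "results")
reloaded = pd.read_csv(saved)
print(saved.name, reloaded.shape)
```

In the notebook, the equivalent call would be save_results(df_results), which keeps the same file naming scheme as the lines above.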