Arirang Satellite Imagery AI Object Detection Competition

Error while running train.py

2020.10.29 00:45 6,262 Views

I get the errors below when I run train.py. Does anyone know what causes this kind of error?


2020-10-28 15:22:48.178559: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.

2020-10-28 15:22:48.239040: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.

2020-10-28 15:22:48.726170: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:533] remapper failed: Invalid argument: MutableGraphView::SortTopologically error: detected edge(s) creating cycle(s) {'gradients/cond_1/map/while/rotated_subsampling/Shape/Enter_grad/Switch_1' -> 'gradients/cond_1/map/while/rotated_subsampling/Shape/Enter_grad/concat_2', 'gradients/cond_1/map/while/Merge_2_grad/tuple/control_dependency_1' -> 'gradients/cond_1/map/while/TensorArrayWrite/TensorArrayWriteV3_grad/TensorArrayReadV3', 'gradients/cond_1/map/while/Merge_2_grad/tuple/control_dependency_1' -> 'gradients/cond_1/map/while/Switch_2_grad_1/NextIteration', 'gradients/cond_1/map/while/Merge_2_grad/tuple/control_dependency_1' -> 'gradients/cond_1/map/while/TensorArrayWrite/TensorArrayWriteV3_grad/tuple/group_deps', 'gradients/cond_1/map/while/TensorArrayWrite/TensorArrayWriteV3_grad/TensorArrayGrad/TensorArrayGradV3' -> 'gradients/cond_1/map/while/TensorArrayWrite/TensorArrayWriteV3_grad/TensorArrayReadV3'}.

2020-10-28 15:22:48.770432: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:533] arithmetic_optimizer failed: Invalid argument: The graph couldn't be sorted in topological order.

2020-10-28 15:22:48.808506: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.

2020-10-28 15:22:48.848010: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.

INFO:tensorflow:global_step/sec: 0

I1028 15:22:51.111244 140073284495104 supervisor.py:1099] global_step/sec: 0

2020-10-28 15:22:55.939256: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 37748736 exceeds 10% of system memory.

2020-10-28 15:22:56.120221: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 37748736 exceeds 10% of system memory.

2020-10-28 15:22:56.170123: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 37748736 exceeds 10% of system memory.

2020-10-28 15:22:56.328071: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 37748736 exceeds 10% of system memory.

2020-10-28 15:22:59.510732: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 603979776 exceeds 10% of system memory.

tcmalloc: large alloc 1176764416 bytes == 0xc1272000 @  0x7f65ab48cb6b 0x7f65ab4ac379 0x7f65948d8e47 0x7f65946d535f 0x7f659459fe9b 0x7f6594565876 0x7f6594566703 0x7f65945668d3 0x7f6597ca2eba 0x7f6594807dcc 0x7f65947fa535 0x7f65948baf81 0x7f65948b8678 0x7f65a9d8c6df 0x7f65aae6e6db 0x7f65ab1a7a3f

2020-10-28 15:23:01.755467: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at gather_nd_op.cc:47 : Invalid argument: indices[299,298,0] = [299, 298, -1] does not index into param shape [300,300,24]

2020-10-28 15:23:01.785950: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at gather_nd_op.cc:47 : Invalid argument: indices[299,298,0] = [299, 298, -1] does not index into param shape [300,300,24]

tcmalloc: large alloc 1450614784 bytes == 0x1d89c2000 @  0x7f65ab48cb6b 0x7f65ab4ac379 0x7f65948d8e47 0x7f65946d535f 0x7f659459fdbb 0x7f6594565876 0x7f6594566703 0x7f65945668d3 0x7f6597c3452b 0x7f6594807dcc 0x7f65947fa535 0x7f65948baf81 0x7f65948b8678 0x7f65a9d8c6df 0x7f65aae6e6db 0x7f65ab1a7a3f

2020-10-28 15:23:14.503888: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at gather_nd_op.cc:47 : Invalid argument: indices[57,86841,0] = [57, 86841, -1] does not index into param shape [58,86842,24]

2020-10-28 15:23:15.430577: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at gather_nd_op.cc:47 : Invalid argument: indices[57,86841,0] = [57, 86841, -1] does not index into param shape [58,86842,24]

INFO:tensorflow:Error reported to Coordinator: indices[299,298,0] = [299, 298, -1] does not index into param shape [300,300,24]
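
The last messages in the log say that an index tuple such as [299, 298, -1] does not fit a tensor of shape [300,300,24]: the final component, -1, is outside the valid range 0 to 23. As a minimal sketch of how such tuples could be located before they reach the op, the pure-Python check below scans a list of index tuples against a shape. The helper name `find_invalid_indices` and the sample data are hypothetical, made up for illustration; they are not taken from train.py.

```python
def find_invalid_indices(index_tuples, param_shape):
    """Return (position, index_tuple) pairs that fall outside param_shape.

    Hypothetical helper for illustration only; it is not part of TensorFlow
    or of the competition's train.py.
    """
    bad = []
    for pos, idx in enumerate(index_tuples):
        # An index tuple is valid when every component i satisfies 0 <= i < dim.
        in_range = len(idx) <= len(param_shape) and all(
            0 <= i < dim for i, dim in zip(idx, param_shape)
        )
        if not in_range:
            bad.append((pos, tuple(idx)))
    return bad


# Shape and offending tuple copied from the log; the other tuples are made up.
param_shape = (300, 300, 24)
index_tuples = [(0, 0, 3), (299, 298, -1), (10, 20, 23)]

invalid = find_invalid_indices(index_tuples, param_shape)
print(invalid)  # [(1, (299, 298, -1))]
```

If a check like this flags entries in the real input pipeline, the -1 values may come from the data or labels rather than from the model itself, though the log alone does not confirm that.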

Khan
2020.10.29 02:20

Oddly enough, after I restarted the runtime and switched TensorFlow from CPU to GPU, it runs again... I still don't know exactly what the problem was, though.
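
One possible explanation, offered as a guess rather than a confirmed diagnosis: the documentation for `tf.gather_nd` notes that on CPU an out-of-bound index returns an error, while on GPU a 0 is stored in the corresponding output value. If that is what happened here, the -1 indices may still be present on GPU and simply no longer abort the run. The sketch below is a pure-Python stand-in that mimics the two behaviours on a nested list. It does not call TensorFlow, and the function name `gather_nd_sketch` and its `out_of_bounds` parameter are hypothetical.

```python
def gather_nd_sketch(params, index_tuples, out_of_bounds="error"):
    """Gather scalars from a nested list, emulating two out-of-bounds policies.

    out_of_bounds="error" mimics the documented CPU behaviour (raise).
    out_of_bounds="zero" mimics the documented GPU behaviour (store 0).
    This is an illustration only, not TensorFlow's implementation.
    """
    results = []
    for idx in index_tuples:
        value = params
        valid = True
        for i in idx:
            # Negative indices are treated as invalid, unlike normal Python lists.
            if not 0 <= i < len(value):
                valid = False
                break
            value = value[i]
        if valid:
            results.append(value)
        elif out_of_bounds == "zero":
            results.append(0)
        else:
            raise IndexError(f"{list(idx)} does not index into params")
    return results


params = [[1, 2], [3, 4]]
index_tuples = [(0, 1), (1, -1)]  # the second tuple is out of bounds

# "GPU-like" policy: the invalid lookup silently becomes 0.
print(gather_nd_sketch(params, index_tuples, out_of_bounds="zero"))  # [2, 0]

# "CPU-like" policy: the same input raises an error.
try:
    gather_nd_sketch(params, index_tuples, out_of_bounds="error")
except IndexError as exc:
    print("error:", exc)
```

Under this reading, training on GPU would continue with zeros substituted for the invalid lookups, so it may still be worth checking where the -1 values originate.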