714 Commits

Author SHA1 Message Date
Glenn Jocher
b81beb0f5f updates 2019-12-07 22:55:26 -08:00
Glenn Jocher
1f943e886f updates 2019-12-07 15:17:29 -08:00
Glenn Jocher
55ba979816 updates 2019-12-07 01:26:41 -08:00
Glenn Jocher
bb54408f73 updates 2019-12-07 00:05:37 -08:00
Glenn Jocher
d5176e4fc4 updates 2019-12-07 00:01:18 -08:00
Glenn Jocher
2c0985f366 updates 2019-12-06 23:58:47 -08:00
Glenn Jocher
a066a7b8ea updates 2019-12-06 19:05:51 -08:00
Glenn Jocher
63c2736c12 updates 2019-12-04 23:02:32 -08:00
Glenn Jocher
0a04eb9ff1 updates 2019-12-04 15:15:42 -08:00
Glenn Jocher
a2dc8a6b5a updates 2019-12-04 15:15:23 -08:00
Glenn Jocher
93a70d958a updates 2019-12-02 11:31:19 -08:00
Glenn Jocher
3d91731519 updates 2019-12-01 14:07:09 -08:00
Glenn Jocher
e637ae44dd updates 2019-12-01 14:06:11 -08:00
Glenn Jocher
d6a7a614dc updates 2019-12-01 13:51:55 -08:00
Glenn Jocher
92690302bb updates 2019-12-01 13:49:38 -08:00
Glenn Jocher
e613bbc88c updates 2019-11-29 19:10:01 -08:00
Glenn Jocher
9e9a6a1425 updates 2019-11-27 15:50:29 -10:00
Glenn Jocher
82b62c9855 updates 2019-11-27 15:50:00 -10:00
Glenn Jocher
3c57ff7b1b updates 2019-11-25 17:24:05 -10:00
Glenn Jocher
75e8ec323f updates 2019-11-25 11:45:28 -10:00
Francisco Reveriano
26e3a28bee Update train.py for distributive programming (#655)
When attempting to run this function in a multi-GPU environment, I kept getting a runtime error. I solved it by passing the keyword argument named in the error message below (find_unused_parameters=True). I first found the solution here:
https://github.com/pytorch/pytorch/issues/22436
and in the PyTorch tutorial. The error was:

'RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel; (2) making sure all forward function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable). '
2019-11-24 22:21:36 -10:00
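
For illustration, the fix described in commit 26e3a28bee can be sketched roughly as follows. This is a minimal sketch, not the actual change to train.py: the helper name wrap_for_ddp is hypothetical, and it assumes torch.distributed.init_process_group(...) has already been called by the training script.

```python
def wrap_for_ddp(model, find_unused_parameters=True):
    """Wrap a model in DistributedDataParallel (hypothetical helper, not from train.py).

    Assumes the default process group was already initialised with
    torch.distributed.init_process_group(...).
    """
    # Imported lazily so the sketch can be imported on machines without PyTorch.
    from torch.nn.parallel import DistributedDataParallel

    # find_unused_parameters=True tells DDP to detect parameters that did not
    # contribute to the loss in a given iteration, which avoids the
    # "Expected to have finished reduction in the prior iteration" RuntimeError.
    return DistributedDataParallel(
        model, find_unused_parameters=find_unused_parameters
    )
```

A training script would call model = wrap_for_ddp(model) once, after moving the model to its device. Unused-parameter detection adds an extra traversal of the autograd graph on every iteration, so it is usually only worth enabling when some parameters really are skipped in the forward pass.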
Glenn Jocher
7773651e8e updates 2019-11-24 18:38:30 -10:00
Glenn Jocher
f12a2a513a updates 2019-11-24 18:29:29 -10:00
Glenn Jocher
f38723c0bd updates 2019-11-20 19:34:22 -08:00
Glenn Jocher
3a4ed8b3ab updates 2019-11-20 13:40:24 -08:00
Glenn Jocher
bb209111c4 updates 2019-11-20 13:36:15 -08:00
Glenn Jocher
8e327e3bd0 updates 2019-11-20 13:33:25 -08:00
Glenn Jocher
2950f4c816 updates 2019-11-20 13:26:50 -08:00
Glenn Jocher
c14ea59c71 updates 2019-11-20 13:24:50 -08:00
Glenn Jocher
bd498ae776 updates 2019-11-20 13:14:24 -08:00
Glenn Jocher
e58f0a68b6 updates 2019-11-20 12:05:40 -08:00
Glenn Jocher
d355e539d9 updates 2019-11-19 18:47:22 -08:00
Glenn Jocher
d9805d2fb6 updates 2019-11-19 12:42:12 -08:00
Glenn Jocher
2ba1a4c9cc updates 2019-11-18 12:01:17 -08:00
Glenn Jocher
9c716a39c3 updates 2019-11-17 19:00:12 -08:00
Glenn Jocher
a1151c04a7 updates 2019-11-17 18:48:50 -08:00
Glenn Jocher
fe9ade6a64 updates 2019-11-16 12:07:19 -08:00
Glenn Jocher
985006a52a updates 2019-11-14 17:25:29 -08:00
Glenn Jocher
9daa5e858a updates 2019-11-14 17:22:09 -08:00
Glenn Jocher
fedc2150b3 updates 2019-11-14 17:12:55 -08:00
Glenn Jocher
6047be35cf updates 2019-11-14 15:08:58 -08:00
Glenn Jocher
a96e010251 updates 2019-11-14 15:07:27 -08:00
Glenn Jocher
579fdc57f8 updates 2019-11-09 10:56:38 -08:00
Glenn Jocher
97ac36ec6c updates 2019-11-08 10:19:46 -08:00
Glenn Jocher
d0e000b008 updates 2019-11-07 20:11:03 -08:00
Glenn Jocher
09ca721f88 updates 2019-11-06 10:10:53 -08:00
Glenn Jocher
f7f8bb23c2 updates 2019-11-04 16:34:45 -08:00
Glenn Jocher
fd3f2ed65f updates 2019-11-02 19:54:14 -07:00
Glenn Jocher
3ba7fc69b8 updates 2019-11-02 19:47:25 -07:00
Glenn Jocher
8d1ab548c1 updates 2019-10-25 11:04:10 -05:00