Actually, hours per week is closer to the training/holdout dataset for the synthetic dataset trained on 100 samples than for the ones trained on 400, 1600, or 6400 samples; only the one trained on 25600 is more accurate again. I assume that's a coincidence? Overall, I was a bit surprised to see how good the data trained on only 100 records already is!